Abstract

A seminal theorem due to Blackwell (1951) shows that every Bayesian decision-maker prefers
an informative signal Y to another signal X if and only if Y is statistically sufficient for X.
Sufficiency is an unduly strong requirement in most economic problems because it does not
incorporate any structure the model might impose. In this paper, we develop a general theory
of information that allows us to characterize the information preferences of decision-makers
based on how their marginal returns to acting vary with the underlying (unknown) state of the
world. Our analysis imposes one central restriction: we consider "monotone decision problems,"
whereby all decision-makers in the relevant class choose higher actions when higher values of
the signal are realized. We show how this restriction can be exploited to characterize
information preferences using stochastic dominance orders over the distributions of posterior
beliefs generated by different signals. Of particular interest for applied modeling, we identify
conditions under which one decision-maker has a higher marginal value of information than another
decision-maker, and thus will acquire more information. The results are applied to oligopoly
models, labor markets with adverse selection, hiring problems, and a coordination game.
1     Introduction

In a Bayesian decision problem, an agent who is uncertain about the true state of the world must
choose an action after observing an imperfectly informative signal. Blackwell (1951, 1953) proved
the seminal result that every agent faced with such a decision problem will prefer (ex ante) an
informative signal y to another signal x if and only if y is statistically sufficient for x. This notion
of "better information" is useful and intuitive, but it is also quite demanding (as noted by Blackwell
himself).¹ For economic modeling, we might expect that sufficiency is far stronger than needed to
compare information structures: in sharp contrast to most economic models, Blackwell's theorem
places no restrictions on the decision-maker's payoff function.

In this paper, we show that by exploiting properties of the decision-maker's preferences, it is
possible to relax sufficiency and derive comparisons among a richer set of information structures. For
many different classes of decision-makers, we derive conditions under which a signal y is preferred
to another signal x by all members of the class. In most cases, the informativeness order can be
represented as a stochastic dominance ordering over the distribution of posteriors that might arise
from each information source. We also provide conditions under which one decision-maker will have
a higher marginal value of information than another decision-maker, and thus will acquire more
information. The results are applied to oligopoly models, labor markets with adverse selection,
hiring problems, and a coordination game.

The starting point of our analysis is to focus attention on monotone decision problems, each
consisting of a prior belief on the state of the world $H(\omega)$, a class of payoff functions $U$, and a
set of admissible signals $\{x\}$. The class of payoff functions is defined by how the decision-maker's
incremental returns to taking an action ($a$) vary with the state of the world ($\omega$). For example, in
many economic problems, it is assumed that the incremental returns to an action are nondecreasing
in the state of the world (i.e., the payoff function $u(\omega,a)$ is supermodular). Given such a class,
it is possible to (partially) order posterior beliefs about the state of the world so that "higher"
beliefs induce higher actions for all decision-makers in the class. For a signal to be admissible
to a monotone decision problem, we require that all of the posteriors that might be generated
from different realizations of the signal can be ordered in this way. To illustrate, for the class of
decision-makers with supermodular payoffs, higher posteriors in the sense of First Order Stochastic


¹See Blackwell and Girshick (1954). For y to be sufficient for x, all of the posteriors, no matter how unlikely,
generated by x must be in the convex hull of the set of posteriors generated by y. There are many unsatisfying
examples of distributions which cannot be ranked according to sufficiency. Unless a signal x is normally distributed,
x cannot be more informative than a normally distributed signal y in the Blackwell order. See Lehmann (1988) for
further examples.


Dominance (FOSD) induce higher actions, and the signal x is admissible only if the corresponding
set of posteriors can be totally ordered by FOSD. In general, different classes of decision-makers
will induce different "stochastic dominance" orderings over posteriors (for instance, if the marginal
returns to acting are concave in the state of the world, the relevant order is Second Order Stochastic
Dominance (SOSD)).

We  show  that  a  natural  information  ordering  over  signals  is  available  for  monotone  decision 
problems.  We  find  that  a  signal  y  is  preferred  to  x  for  a  class  of  decision-makers  (and  thus  y  is 
"more  informative"),  if  the  "high"  posteriors  induced  by  y  are  (on  average)  higher  than  the  "high" 
posteriors  induced  by  x,  and  the  "low"  posteriors  induced  by  y  are  (on  average)  lower  than  the 
"low"  posteriors  induced  by  x.  The  terms  "high"  and  "low"  refer  to  the  stochastic  dominance  order 
induced  by  the  restriction  on  the  decision-maker's  payoff  function. 

To fix ideas, return again to the example of supermodular payoff functions. Recalling that the
posteriors of the signal x are totally ordered, we can index them with a parameter $\alpha_x$, so that
higher values of the index correspond to higher ordered (by FOSD) posteriors. So that this index
will be comparable across signals, it is normalized to have a uniform distribution ex ante: with
probability $\alpha_x$, a posterior lower (in the FOSD order) than $F(\omega \mid \alpha_x)$ is realized after observing x.
We similarly index the posteriors $G(\omega \mid \alpha_y)$ generated by y. Then a signal y is more informative than
x for all supermodular payoff functions if, for all $\alpha \in [0,1]$,

$$G(\omega \mid \alpha_y > \alpha) \succeq_{FOSD} F(\omega \mid \alpha_x > \alpha).$$

The expression $G(\omega \mid \alpha_y > \alpha)$ represents the average over the highest $1 - \alpha$ fraction of the posteriors
generated by y. Thus, a signal y is more informative than x if high realizations of $\omega$ are more likely
when highly-ranked posteriors are realized. For other classes of payoff functions, other stochastic
dominance orders (such as SOSD) will be relevant for the comparison of average posteriors.
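
The tail-average condition can be checked numerically for finite signals. The following is a minimal sketch, not part of the original analysis: the function names, the proportional splitting of atoms (needed because a discrete signal cannot literally have a uniform index), the finite grid of $\alpha$ values, and the two three-state signals are all assumptions made for illustration.

```python
import numpy as np

def upper_tail_average(probs, posteriors, alpha):
    """Average posterior over the highest (1 - alpha) fraction of signal mass.

    probs[i] is the probability of realization i and posteriors[i] the
    posterior over states it induces; rows are assumed sorted from lowest
    to highest in the FOSD order. Atoms are split proportionally.
    """
    probs = np.asarray(probs, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    upper = np.cumsum(probs)      # percentile at the top of each atom
    lower = upper - probs         # percentile at the bottom of each atom
    # Mass of each atom lying above the alpha-percentile.
    weight = np.clip(upper - np.maximum(lower, alpha), 0.0, None)
    return weight @ posteriors / weight.sum()

def fosd_dominates(p_high, p_low, tol=1e-12):
    """True if p_high has a pointwise lower CDF than p_low."""
    return bool(np.all(np.cumsum(p_high) <= np.cumsum(p_low) + tol))

def more_informative(sig_y, sig_x, n_grid=100):
    """Check the tail-average FOSD condition on a grid of alpha in [0, 1)."""
    grid = np.linspace(0.0, 1.0, n_grid, endpoint=False)
    return all(
        fosd_dominates(upper_tail_average(*sig_y, a),
                       upper_tail_average(*sig_x, a))
        for a in grid
    )

# Hypothetical signals on three ordered states, both with a uniform prior.
sig_y = ([0.5, 0.5], [[2/3, 1/6, 1/6], [0.0, 1/2, 1/2]])
sig_x = ([0.5, 0.5], [[1/2, 1/4, 1/4], [1/6, 5/12, 5/12]])

print(more_informative(sig_y, sig_x), more_informative(sig_x, sig_y))
```

Checking a grid of $\alpha$ values is only an approximation to "for all $\alpha$"; for signals with few atoms a fine grid is usually adequate, but this is a design shortcut rather than a proof.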

The  conditions  we  derive  are  sufficient  for  all  decision-makers  in  a  given  class  to  prefer  one 
signal  to  another;  they  are  necessary  when  we  compare  small  (differential)  changes  in  the  signal 
structure.  We  also  show  that  the  theory  can  be  generalized  by  considering  orders  based  on  single 
crossing  properties  rather  than  stochastic  dominance. 

Our second objective is to derive conditions under which one decision-maker will have a higher
marginal value for information than another, and thus will acquire more information when
information is costly. We find that if u and v are in the same class of payoff functions, and the two
decision-makers consider purchasing signals ranked according to our criteria, then decision-maker
u buys more information than v if u's preferences over the distribution of posteriors when using an
optimal decision rule are "more sensitive" than v's. The meaning of "sensitive" is determined by


the  class  of  payoff  functions  under  consideration.  Although  our  conditions  depend  on  the  optimal 
policies  chosen  by  the  agents,  and  thus  are  not  primitive,  they  can  be  verified  in  many  applications. 

Our results are related to work in statistics by Lehmann (1988). He considered one specific class
of monotone decision problems, namely those where the decision-maker's payoffs satisfy a single crossing
property, and where the signals satisfy the monotone likelihood ratio property. For such problems,
Lehmann (1988) derived a new information ordering that relaxes Blackwell's sufficiency criterion.
Lehmann's effectiveness ordering has already entered economics in the context of auctions (Persico,
1997), principal-agent problems (Jewitt, 1997), and implicit incentive models (Dewatripont, Jewitt
and Tirole, 1997). While our methods are quite different (Lehmann uses an approach based
on statistical hypothesis testing), we obtain the effectiveness ordering as a special case. Closely
related to Lehmann's work is that of Persico (1996), who studied the same specific class of decision
problems. His paper develops an approach to ranking decision-makers in terms of their incentives to
acquire information.² Our approach to information acquisition builds directly on his, and provides
a significant generalization to other classes of monotone decision problems.

The paper develops as follows. In the next section we describe the model and briefly review the
standard approach to information and some preliminary results on stochastic orderings. In Section
3, we introduce the idea of monotone decision problems (MDPs), and discuss some important
classes of MDPs. Section 4 includes our main theorems on ordering information structures, and
characterizes the monotone information order for several classes of MDPs. Section 5 presents results
on ordering payoff functions in terms of their marginal value for information. Section 6 gives some
economic applications: to information gathering by firms, adverse selection in labor markets,
a coordination game under uncertainty, and a hiring problem. The last section discusses some
possible extensions and concludes.

2     The  Model 

2.1     The  Bayesian  Decision  Problem 

A decision-maker (DM) who is uncertain about the true state of the world must take an action
after observing an informative signal. The state of the world is denoted $\omega \in \Omega$, where $\Omega \subset \mathbb{R}$ is
an interval. Let $\mathcal{P}$ denote the set of all probability distributions on $\Omega$. The DM must choose an
action $a \in A \subset \mathbb{R}$. Her payoff $u(\omega, a)$ depends on both her action and the true state; we assume $u$


²Persico (1997) further established that similar techniques can be used to rank the revenue to the auctioneer under
different auction formats.


is a bounded measurable function taking $\Omega \times \mathbb{R} \to \mathbb{R}$. Throughout, we will maintain the following
assumption about the set of available actions.

(A) Either $A$ is finite, or $A$ is a compact interval of $\mathbb{R}$ and $u(\omega, a)$ is continuous in $a$.

The DM has prior distribution $H(\omega)$. Before acting, the DM observes some informative random
variable x, with support $\mathcal{X} \subset \mathbb{R}$, and forms a posterior distribution $F(\omega \mid x)$. The joint distribution of
$(\omega, x)$ is then written $F(\omega, x)$, while the marginal distributions are denoted $F(x)$ and $F(\omega) = H(\omega)$.³
We will refer to $F$ as an "information structure."

Observe that many different information structures can be equivalent from the perspective of
decision-making, since only the posterior generated by a signal realization affects behavior (the
value of the signal does not). In particular, it does not matter if the DM observes x or some $T(x)$,
so long as $F(\omega \mid T(\tilde{x}) = T(x)) = F(\omega \mid \tilde{x} = x)$ for all x (writing $\tilde{x}$ for the random variable
and $x$ for its realization). The payoff-relevant features of an information
structure, $F$, can be uniquely characterized in terms of the probability measure induced on the space
of posteriors $\mathcal{P}$, which we write as $\mu_F$.⁴ We will begin by working with this abstract characterization
of an information structure before mapping back to the first formulation in terms of the joint
distribution $F(\omega, x)$.

The decision-maker's problem, given a posterior distribution $P \in \mathcal{P}$, is to solve

$$\max_{a \in A} \int_\Omega u(\omega, a)\,dP(\omega) \tag{1}$$

to obtain an optimal action $a^*(P)$, and a realized payoff $u(\omega, a^*(P))$. We define the (ex ante) value
of the decision problem as

$$V^*(F,u) = \int_{\mathcal{P}} \int_\Omega u(\omega, a^*(P))\,dP(\omega)\,d\mu_F(P). \tag{2}$$
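
For a finite state space, a finite action set, and a signal with finitely many realizations, (1) and (2) reduce to matrix operations. The sketch below assumes this finite setting; the function name `value_of_problem` and the two-state "guess the state" payoff are illustrative choices, not objects from the paper.

```python
import numpy as np

def value_of_problem(probs, posteriors, payoff):
    """Ex ante value V*(F, u) for a finite information structure.

    probs[i]      : probability that posterior i is realized (the measure mu_F)
    posteriors[i] : posterior distribution over states (a row summing to 1)
    payoff[s, a]  : payoff u(omega_s, a) in state s under action a
    """
    probs = np.asarray(probs, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    payoff = np.asarray(payoff, dtype=float)
    expected = posteriors @ payoff   # expected payoff of each action under each posterior
    best = expected.max(axis=1)      # inner maximization, problem (1)
    return float(probs @ best)       # expectation over posteriors, equation (2)

# Hypothetical payoff: the DM earns 1 if her action matches the state.
payoff = np.eye(2)
uninformative = value_of_problem([1.0], [[0.5, 0.5]], payoff)
partial = value_of_problem([0.5, 0.5], [[0.8, 0.2], [0.2, 0.8]], payoff)
perfect = value_of_problem([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], payoff)
print(uninformative, partial, perfect)
```

All three information structures share the prior (1/2, 1/2), so the differences in value come only from how far the posteriors are spread around it.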

2.2     The  Classical  Approach  to  Information 

The classical approach to information, due to Blackwell (1951, 1953), begins by writing $V^*(F,u)$
as

$$V^*(F,u) = \int_{\mathcal{P}} \Big\{ \max_{a(P)} \int_\Omega u(\omega, a(P))\,dP(\omega) \Big\}\,d\mu_F(P) = \int_{\mathcal{P}} u^*(P)\,d\mu_F(P). \tag{3}$$


³Note that $F(\omega) = H(\omega)$ is needed to ensure that the expectation of the posterior is the prior. Taking the
joint distribution of $(\omega, x)$ as primitive is a departure from the analyses of Blackwell (1951) and Lehmann (1988),
which take as primitive the distribution of $x \mid \omega$. This means that their information orders are the same for all prior
distributions $H$. Our information rankings will be given relative to a fixed prior $H$. Thus, we may sensibly consider
averages over subsets of the posterior distributions.

⁴This construction is introduced in Blackwell's original article (Blackwell, 1951). Endow $\mathcal{P}$ with the weak topology.
Then $F(\omega \mid x)$ is measurable with respect to the Borel $\sigma$-algebra and $\mu_F(B) = \int_{\{x : F(\cdot \mid x) \in B\}} dF(x)$.


Using revealed preference, it is straightforward to show that $u^*(P)$ will be convex in $P$. Simply
observe that for $\lambda \in [0,1]$, $P^1, P^2 \in \mathcal{P}$,

$$\max_a \Big\{ \lambda \int_\Omega u\,dP^1 + (1-\lambda) \int_\Omega u\,dP^2 \Big\} \le \max_a \lambda \int_\Omega u\,dP^1 + \max_a\,(1-\lambda) \int_\Omega u\,dP^2. \tag{4}$$

Now suppose $G$ is an alternative information structure which induces a measure $\mu_G$ over
posterior distributions, and that $\mu_G$ stochastically dominates $\mu_F$ for convex functions, i.e.
$\int_{\mathcal{P}} \varphi\,d\mu_G \ge \int_{\mathcal{P}} \varphi\,d\mu_F$ for any convex function $\varphi : \mathcal{P} \to \mathbb{R}$. Clearly, if $\mu_G$ stochastically dominates $\mu_F$ for
convex functions, then $V^*(G,u) \ge V^*(F,u)$ for any payoff function $u$ and action set $A$, i.e. every
decision-maker will find $G$ more valuable than $F$; Blackwell (1951) showed the converse via a
separation argument. Thus, Blackwell's order can be thought of as a multivariate generalization of the
(perhaps more familiar) mean-preserving spread of Rothschild and Stiglitz (1970). The geometric
representation of a mean-preserving spread in multiple dimensions can be formalized using what
is known as a "dilation," so that $\mu_G$ is a dilation of $\mu_F$.⁵ It can further be shown that for a signal
y to be sufficient for x, the set of posteriors generated by x must lie in the convex hull of those
generated by y.

Sufficiency can be understood in the context of the following three-state, two-signal example. Let
$\Omega = \{\omega_1, \omega_2, \omega_3\}$, where $\omega_1 < \omega_2 < \omega_3$, and suppose the prior on $\omega$ is uniform, $(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$. Consider a
signal y that is equally likely to induce posterior beliefs $(\tfrac{2}{3}, \tfrac{1}{6}, \tfrac{1}{6})$ and $(0, \tfrac{1}{2}, \tfrac{1}{2})$, and another signal x
that is equally likely to induce one of two posterior beliefs lying strictly between these on the line
segment that joins them. As illustrated in Figure 1,
the posteriors generated by y are a mean-preserving spread of the posteriors generated by x. So y
will be sufficient for x. But this is clearly very special, since y can be sufficiency ranked relative to
another signal z only if the posteriors generated by z lie on the line between $(\tfrac{2}{3}, \tfrac{1}{6}, \tfrac{1}{6})$ and $(0, \tfrac{1}{2}, \tfrac{1}{2})$.
The ranking will be upset by even the slightest perturbation of z leading to a potential posterior
belief off of the line. This lack of robustness is unappealing.⁶ Figure 1 also highlights the fact
that no two-point information structure can be sufficient for y (the prior is fixed, and so no further
spreading is possible). In spite of this, y is clearly not uniquely suited to making all types of
inferences. For instance, y provides no information at all as to the relative likelihood of $\omega_2$ versus
$\omega_3$. And while y provides significant information as to whether $\omega$ is low or high, many other two-
point information structures might be more informative on this question. Since it seems reasonable
⁵A dilation $D : P \mapsto D_P$ is a mapping from $\mathcal{P}$ into the set of probability measures on $\mathcal{P}$ such that $\int_{\mathcal{P}} Q\,dD_P(Q) = P$.
That is, $D$ associates with each $P \in \mathcal{P}$ a non-degenerate probability measure on $\mathcal{P}$ with mean $P$. For our problem,
$\mu_G = \int_{\mathcal{P}} D_P\,d\mu_F(P)$. Strassen (1965) has shown that other stochastic dominance relations also have this sort of
representation.

⁶Indeed, it has motivated work in statistics on "approximate sufficiency"; see Le Cam (1964).


to  suppose  that  particular  classes  of  decision-makers  —  especially  the  classes  of  decision-makers 
that  appear  in  economic  models  —  care  only  about  making  certain  types  of  inferences,  the  example 
suggests  that  it  should  be  possible  to  derive  other  informativeness  criteria. 
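
The fragility of the sufficiency ranking in the three-state example can be seen numerically. In the sketch below, the posteriors of y are those of the example; the posteriors of x are a hypothetical choice on the segment between them, and the signal z, the two test payoffs, and all function names are likewise assumptions introduced for illustration.

```python
import numpy as np

def value(probs, posteriors, payoff):
    """Ex ante value: expected maximum over actions of posterior expected payoff."""
    return float(np.asarray(probs) @ (np.asarray(posteriors) @ payoff).max(axis=1))

probs = np.array([0.5, 0.5])
post_y = np.array([[2/3, 1/6, 1/6], [0.0, 1/2, 1/2]])
# Hypothetical x: each posterior is a mixture of the two y-posteriors.
post_x = np.array([[0.75, 0.25], [0.25, 0.75]]) @ post_y
# Hypothetical z: same uniform prior, but posteriors off the segment.
post_z = np.array([[1/2, 1/2, 0.0], [1/6, 1/6, 2/3]])

# y is weakly better than x for 1000 random payoffs (3 states, 4 actions).
rng = np.random.default_rng(0)
y_never_worse_than_x = all(
    value(probs, post_y, u) >= value(probs, post_x, u) - 1e-12
    for u in rng.normal(size=(1000, 3, 4))
)

# Two payoffs that rank y and z in opposite ways.
bet_low_vs_rest = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])    # is the state omega_1?
bet_mid_vs_high = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])    # omega_2 or omega_3?
y_wins = value(probs, post_y, bet_low_vs_rest) > value(probs, post_z, bet_low_vs_rest)
z_wins = value(probs, post_z, bet_mid_vs_high) > value(probs, post_y, bet_mid_vs_high)
print(y_never_worse_than_x, y_wins, z_wins)
```

The random-payoff check is only suggestive evidence of sufficiency (here it is guaranteed by convexity of $u^*$), whereas the two explicit payoffs are a genuine demonstration that y and z cannot be ranked for all decision-makers.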

Unfortunately, the classical approach is ill-suited to exploiting any structure one might
impose on the decision-maker's preferences. Most important from our perspective, and from that
of many economic models, are assumptions about how the returns to taking a higher action,
$r(\omega) = u(\omega, a^H) - u(\omega, a^L)$, depend on the state; for instance, $r(\omega)$ might be increasing, or
concave, or polynomial in the state of the world, or positive if and only if the state variable is
greater than some critical level $\omega_0$. Using the above approach it is hard to see how to incorporate
such a restriction. For example, suppose one restricted attention to the class of supermodular
payoff functions ($r(\omega)$ increasing in $\omega$). First, it is not clear that this implies anything about
$u^*(\omega, P) = u(\omega, a^*(P))$; and second, even if it did, we would still need to connect this property
with a meaningful characterization of "better" information. In the next few sections, we will show
that by placing a total order over the posteriors (such as a stochastic dominance order), it is possible
to overcome these difficulties.

3     Monotone  Decision  Problems  and  Stochastic  Orders 

3.1     Preliminaries:  Stochastic  Dominance  and  Single  Crossing 

Since we will make extensive use of stochastic orderings, we first review a few definitions and known
facts.⁷ Assume that $U$ is some set of measurable functions taking $\mathbb{R}^n \to \mathbb{R}$, and let $P^H$, $P^L$ be two
probability distributions on $\mathbb{R}^n$. We say that $P^H$ stochastically dominates $P^L$ with respect to $U$,
written $P^H \succeq_{SD\text{-}U} P^L$, if

$$\int u(z)\,dP^H(z) \ge \int u(z)\,dP^L(z) \quad \text{for all } u \in U. \tag{SD}$$

If $U$ is the set of nondecreasing univariate functions, then $\succeq_{SD\text{-}U}$ is the standard First Order
Stochastic Dominance (FOSD) relationship. If $U$ is the set of concave univariate functions, then
$\succeq_{SD\text{-}U}$ corresponds to Second Order Stochastic Dominance (SOSD).

Any  set  of  functions  U  induces  a  stochastic  dominance  order.  Some  critical  features  of  stochastic 
dominance  orders  for  our  purposes  can  be  understood  using  the  notion  of  a  closed  convex  cone,  as 
follows. 


⁷For further treatments of this material, see inter alia, Karlin and Studden (1966), Karlin (1968), Jewitt (1986),
Border (1991) and Athey (1998a, 1998b).


Definition 1 A set $U$ is a closed convex cone (ccc) if (a) $u, v \in U$ implies that $\alpha u + \beta v \in U$ for
any $\alpha, \beta > 0$, and (b) $U$ is closed under the weak topology.

If $\int u\,dP^H \ge \int u\,dP^L$ for all $u \in U$, then the same inequality will hold for any function $v$ in the
closed convex cone generated by $U$ (denoted $ccc(U)$). Further, since $P^H$ and $P^L$ are probability
distributions, then (SD) holds for the functions $u(z) = 1$ and $u(z) = -1$ (denoted $\{1,-1\}$). Thus,
the set $U$ generates the same stochastic dominance order as $ccc(U \cup \{1,-1\})$.⁸
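
Because a stochastic dominance order is determined by any set of functions generating the same closed convex cone, it can be checked on a small generating family. The sketch below does this for distributions on a common finite grid; the function names are hypothetical, and the concave case follows the definition in the text (all concave functions, not only increasing ones), which forces the two means to be equal.

```python
import numpy as np

def fosd(p_high, p_low, tol=1e-12):
    """Dominance for all nondecreasing u: the CDF of p_high is pointwise lower.

    Equivalent to checking (SD) on the indicator functions 1{z >= c}.
    """
    return bool(np.all(np.cumsum(p_high) <= np.cumsum(p_low) + tol))

def concave_order(z, p_high, p_low, tol=1e-12):
    """Dominance for all concave u on the sorted grid z.

    Checks (SD) on u(z) = z and u(z) = -z (equal means) and on the
    functions u_c(z) = -max(c - z, 0) for each grid point c.
    """
    z, p_high, p_low = (np.asarray(v, dtype=float) for v in (z, p_high, p_low))
    if abs(z @ p_high - z @ p_low) > tol:
        return False
    # shortfall[c, i] = max(z_c - z_i, 0)
    shortfall = np.maximum(z[:, None] - z[None, :], 0.0)
    return bool(np.all(shortfall @ p_high <= shortfall @ p_low + tol))

grid = [0.0, 1.0, 2.0]
concentrated = [0.0, 1.0, 0.0]
spread = [0.5, 0.0, 0.5]      # a mean-preserving spread of `concentrated`
print(fosd([0.0, 0.5, 0.5], [0.5, 0.5, 0.0]),
      concave_order(grid, concentrated, spread),
      concave_order(grid, spread, concentrated))
```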

A weaker notion than stochastic dominance is that of stochastic single crossing.⁹ We write
$P^H \succeq_{SC\text{-}U} P^L$ if

$$\int u(z)\,dP^L(z) \ge 0 \;\Longrightarrow\; \int u(z)\,dP^H(z) \ge 0 \quad \text{for all } u \in U. \tag{SC}$$

If $\{1,-1\} \subset U$, then one can show (Athey, 1998a) that $\succeq_{SC\text{-}U}$ is equivalent to $\succeq_{SD\text{-}U}$.¹⁰ If $U$
does not contain constant functions, however, $\succeq_{SC\text{-}U}$ is weaker than $\succeq_{SD\text{-}U}$. The distinction
between stochastic dominance and stochastic single crossing will become relevant in our analysis
of information orders, since some sets of payoff functions we wish to consider (most notably, single
crossing payoff functions, which arise in auction games and portfolio problems) do not contain the
constant functions.

The stochastic single crossing order is used in our analysis because it implies a comparative
statics prediction about monotonicity of the optimal policy.¹¹

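Lemma 1 Let $U_1$ be some set of functions taking $\Omega \to \mathbb{R}$. Suppose that $u(\omega,a) : \Omega \times \mathbb{R} \to \mathbb{R}$ and
that for all $a^H > a^L$, $u(\omega,a^H) - u(\omega,a^L) \in U_1$. Let $a^*(P) = \arg\max_{a \in A} \int_\Omega u(\omega,a)\,dP(\omega)$. Then
$P^H \succeq_{SC\text{-}U_1} P^L$ implies that there exists a selection $\hat{a}(P)$ from $a^*(P)$ such that $\hat{a}(P^H) \ge \hat{a}(P^L)$.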


⁸In fact, just this insight allows us to characterize stochastic dominance orderings, since we can also consider
the order induced by a much smaller set $E_U$ (for the case of nondecreasing functions, a set of indicator functions),
loosely referred to as extreme points, so long as $U = ccc(E_U \cup \{1,-1\})$. See Border (1991) or Athey (1998a) for more
discussion.

⁹Various definitions of single crossing properties (using different combinations of weak and strict inequalities) have
been proposed by different authors in economics (Milgrom and Shannon, 1994; Shannon, 1995) and statistics (Karlin,
1968). The definition in (SC) is not the most natural variation for comparative statics theorems. However, as it
involves only weak inequalities, (SC) is especially appropriate for working with closed convex cones and stochastic
orders.

¹⁰This result hinges on the assumption that $P^H$ and $P^L$ are probability distributions.

¹¹To simplify the notation and the statement of the result we consider only sufficiency here, but with minor
qualifications, the single crossing condition is in fact necessary for comparative statics as well. See Milgrom and
Shannon (1994) and Shannon (1995).


Proof. Define a function $v(\tau,a) : \{0,1\} \times \mathbb{R} \to \mathbb{R}$, with $v(1,a) = \int_\Omega u(\omega,a)\,dP^H(\omega)$ and $v(0,a) =
\int_\Omega u(\omega,a)\,dP^L(\omega)$. Then $v(\tau,a)$ is bivariate weak single crossing in $(\tau,a)$ and by Shannon (1995),
there exists a selection from $a^*(\tau) = \arg\max_{a \in A} v(\tau,a)$ that is nondecreasing. □
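
Lemma 1 can be illustrated with a small numerical example. The sketch below assumes a specific supermodular payoff, $u(\omega,a) = \omega a - a^2/2$, a three-point state space, and an action grid; these, together with the two beliefs and the choice of the smallest maximizer as the selection, are assumptions made for illustration.

```python
import numpy as np

states = np.array([0.0, 1.0, 2.0])
actions = np.linspace(0.0, 2.0, 21)

# u(omega, a) = omega * a - a**2 / 2: the incremental return to a higher
# action, omega * (aH - aL) - (aH**2 - aL**2) / 2, is nondecreasing in omega.
payoff = states[:, None] * actions[None, :] - actions[None, :] ** 2 / 2

def optimal_action(belief):
    """Smallest maximizer of expected payoff under the given belief."""
    expected = np.asarray(belief, dtype=float) @ payoff
    return float(actions[np.argmax(expected)])   # argmax returns the first maximizer

belief_low = [2/3, 1/6, 1/6]
belief_high = [0.0, 1/2, 1/2]     # dominates belief_low in the FOSD order

a_low, a_high = optimal_action(belief_low), optimal_action(belief_high)
print(a_low, a_high)
```

With this payoff the expected payoff is maximized at the grid point closest to the posterior mean of $\omega$, so the higher belief leads to the higher action, as the lemma predicts.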

3.2     Monotone  Decision  Problems 

Our approach is to restrict attention to particular classes of decision problems. Each class will be
characterized by a restriction on the DM's payoff function and a corresponding restriction on the
set of admissible information structures.

Definition 2 The pair $(U_2, \mathcal{F})$ constitutes a class of monotone decision problems if there exists a
prior $H(\omega)$ and some set $U_1$ of bounded measurable functions taking $\Omega$ into $\mathbb{R}$, such that:

(MDP-$U$) For all $u \in U_2$, if $a^H, a^L \in \mathbb{R}$ and $a^H > a^L$, then $u(\omega,a^H) - u(\omega,a^L) \in U_1$. And
moreover, if $u : \Omega \times \mathbb{R} \to \mathbb{R}$ is some bounded measurable function with incremental returns in
$U_1$, then $u \in U_2$.

(MDP-$\mathcal{F}$) For all $F \in \mathcal{F}$, $F(\omega) = H(\omega)$, and further if $x^H, x^L \in \mathrm{support}(F)$ and $x^H > x^L$, then
$F(\omega \mid x^H) \succeq_{SC\text{-}U_1} F(\omega \mid x^L)$. And moreover, if $F$ is an information structure with prior $H(\omega)$
and posteriors completely ordered in x by $\succeq_{SC\text{-}U_1}$, then $F \in \mathcal{F}$.

If $(U_2, \mathcal{F})$ satisfy this definition, we refer to them as an MDP pair. The first condition,
(MDP-$U$), implies that every DM under consideration has incremental returns to acting that rely on the
state in some pre-specified way (for example, they might be nondecreasing or concave); the second,
(MDP-$\mathcal{F}$), implies that for every admissible information structure, the induced posterior beliefs
are completely ordered in the sense of stochastic single crossing.

What do these restrictions buy us? Let $(U_2, \mathcal{F})$ be an MDP pair. By (MDP-$U$) and Lemma 1,
each DM $u \in U_2$ has an optimal policy $a^*(P)$ that is monotone in $P$ when posteriors are ordered
by $\succeq_{SC\text{-}U_1}$. And (MDP-$\mathcal{F}$) implies that if $F \in \mathcal{F}$, then every posterior associated with $F$ can
be ordered in this way and indexed by x. Thus, each DM $u \in U_2$ takes higher actions when she
receives higher realizations of the signal.

In this way, the MDP restrictions impose an ordinal structure, so that one can heuristically
think of the posteriors generated by x going from low to high along a single dimension. However,
the goal in this paper is to compare different information structures. While (MDP-$\mathcal{F}$) implies
that the posteriors for a given signal can be totally ordered, it provides no guidance as to how to
compare the posteriors arising from different signals. Thus, we introduce a cardinal index for the
posteriors generated by each signal, $\alpha \in [0,1]$, so that we may "match" for comparison posteriors
from different signals. This is accomplished using a strictly increasing function $T_F : \mathcal{X} \to [0,1]$, so
that $T_F(x) = \alpha$.

For an important set of cases, we will show that it is appropriate to match posteriors according
to their percentile in the ex ante signal distribution. Formally, for an information structure $F(\omega, x)$
with marginal distribution $F(x)$, we let $T_F(x) = F(x)$. Using this index, we can let $F(\omega,\alpha) =
F(\omega, F^{-1}(\alpha))$ on $\Omega \times [0,1]$,¹² so that $F(\omega \mid \alpha)$ is the "$\alpha$-percentile posterior." The probability of
observing a posterior below $F(\omega \mid \alpha)$ in the $\succeq_{SC\text{-}U_1}$ order is given by $\alpha$, so that $F(\alpha)$ is the uniform
distribution on $[0,1]$. As illustrated in Figure 2, we can use such an index to compare the realizations
of two signals, x and y, according to their $\alpha$-percentile.
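
The percentile index is simply the probability integral transform applied to each signal's marginal distribution. As a minimal sketch, suppose (purely for illustration) that x has a standard normal marginal and y a unit exponential marginal; then a realization of x is matched with the realization of y at the same percentile. The function names below are hypothetical.

```python
import math

def normal_cdf(x):
    """Marginal distribution F(x) of the hypothetical signal x ~ N(0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exponential_quantile(alpha):
    """Inverse marginal G^{-1}(alpha) of the hypothetical signal y ~ Exp(1)."""
    return -math.log(1.0 - alpha)

def matched_realization(x):
    """Return the index alpha = T_F(x) = F(x) and the y with the same index."""
    alpha = normal_cdf(x)
    return alpha, exponential_quantile(alpha)

alpha, y = matched_realization(0.0)   # the median of x is matched with the median of y
print(alpha, y)
```

Both marginals here are continuous and strictly increasing on their supports, which is the case the construction in the text assumes; signals with atoms would need the atoms to be split or smoothed first.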

Using this construction, we can represent a decision-maker's policy function $a : [0,1] \to A$ in
terms of the action it prescribes for an $\alpha$-indexed posterior. Policy functions $a(\alpha)$ can then be
analyzed without reference to a particular information structure. Using this representation, for any
probability distribution $F$ on $\Omega \times [0,1]$, we define the ex ante expected value from using the policy
$a$ by

$$V(F,u,a) \equiv \int_{[0,1]} \int_\Omega u(\omega, a(\alpha))\,dF(\omega,\alpha). \tag{5}$$

An important consequence of (MDP-$U$) follows: for any nondecreasing $a(\alpha)$,

$$u(\omega, a(\alpha^H)) - u(\omega, a(\alpha^L)) \in U_1 \quad \text{for } \alpha^H > \alpha^L. \tag{6}$$

This in turn implies that $u(\omega, a(\alpha))$ viewed as a function of $(\omega, \alpha)$ can be extended to be in $U_2$.
In words, whenever the DM uses a monotone policy, the incremental returns to having a higher
$\alpha$-indexed posterior retain the properties we assumed for the incremental returns of the primitive
payoff function, and thus $u(\omega, a(\alpha))$ inherits the properties specified by $U_2$.
The discussion of this subsection can be formally summarized as follows.

Theorem 2 Suppose $(U_2, \mathcal{F})$ are an MDP pair and that (A) holds. Consider any $(u,F) \in (U_2,
\mathcal{F})$. Then for any $T_F : \mathcal{X} \to [0,1]$ continuous and strictly increasing, $F(\omega,\alpha) = F(\omega, T_F^{-1}(\alpha)) \in \mathcal{F}$.
Further, there exists some nondecreasing $a^F : [0,1] \to A$ such that:

¹²We can, without loss of generality, take $F(x)$ to be continuous and strictly increasing. Lehmann (1988, p. 527)
provided a construction which shows that one can always take $F(x)$ to be continuous. If $F(x)$ is constant over some
interval $(x_0, x_1]$, notice that the DM would experience no payoff loss from observing $x^* = x$ on $x < x_0$, $x^* = x_0$ on
$[x_0, x_1]$ and $x^* = x - (x_1 - x_0)$ for $x > x_1$. And $F^*(x^*)$ will be strictly increasing (one needs a more involved construction if $\mathcal{X}$
is not compact). Then $F(x) : \mathcal{X} \to [0,1]$ is a bijection ($F$ is continuous and strictly monotone), so $F^{-1}$ is well defined.


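(i) For all $\alpha \in [0,1]$, $a^F(\alpha) \in \arg\max_{a \in A} \int_\Omega u(\omega,a)\,dF(\omega \mid \alpha)$.

(ii) $V(F,u,a^F(\cdot)) = V^*(F,u)$.

(iii) For any $\alpha^H > \alpha^L$, $u(\omega, a^F(\alpha^H)) - u(\omega, a^F(\alpha^L)) \in U_1$.

(iv) There exists $u^F : \Omega \times \mathbb{R} \to \mathbb{R}$, $u^F \in U_2$, with $u^F(\omega,\alpha) = u(\omega, a^F(\alpha))$.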

Proof. (i) Because $A$ is compact, for all $P \in \mathcal{P}$, $\int_\Omega u(\omega,a)\,dP(\omega)$ attains a maximum on $A$.
For any $\alpha^H > \alpha^L$, monotonicity of $T_F$ and (MDP-$\mathcal{F}$) imply that $F(\omega \mid \alpha^H) \succeq_{SC\text{-}U_1} F(\omega \mid \alpha^L)$.
So from Lemma 1, there will exist a nondecreasing selection from the set of optimizers, denoted
$a^F(\alpha)$. (ii) By (i), $\int_\Omega u(\omega, a^F(\alpha))\,dF(\omega \mid \alpha) = \max_{a \in A} \int_\Omega u(\omega,a)\,dF(\omega \mid T_F^{-1}(\alpha))$. The result then
follows immediately from the definitions of the distributions. (iii) Let $a^H = a^F(\alpha^H)$, $a^L = a^F(\alpha^L)$
and note that $a^H \ge a^L$. But then $u(\omega, a^F(\alpha^H)) - u(\omega, a^F(\alpha^L)) = u(\omega,a^H) - u(\omega,a^L) \in U_1$ by
(MDP-$U$). (iv) The function $u^F(\omega,\alpha)$ can be defined to equal $u(\omega, a^F(\alpha))$ for $\alpha \in [0,1]$, to equal
$u(\omega, a^F(0))$ for $\alpha < 0$, and to equal $u(\omega, a^F(1))$ for $\alpha > 1$. It follows from (MDP-$U$) that $u^F \in U_2$.
□

3.3     Examples 

Our  framework  covers  many  problems  of  economic  interest.  In  this  section,  we  discuss  four  common 
classes  of  problems  and  then  show  briefly  how  the  theory  can  be  applied  to  other  cases. 

Example 1. Supermodular (Incremental returns increasing in $\omega$). Suppose $U_1$ is the set of nondecreasing functions. Then $U_2$ is the set of functions $u(\omega,a)$ that are supermodular in $(\omega,a)$. In words, the incremental return to a higher action is nondecreasing in the state of the world. Numerous economic applications take this form (see Milgrom and Roberts (1990)); for example, $a$ might represent the level of investment for a firm, where the marginal returns are indexed by $\omega$. Since $U_1$ contains constant functions, we have $P^H \succ_{SC-U_1} P^L$ if and only if $P^H$ is higher than $P^L$ according to FOSD, denoted $P^H \succ_{FOSD} P^L$. So (MDP-$\mathcal{F}$) implies that if $F \in \mathcal{F}$, the induced posteriors must be ranked by FOSD, which requires that $F(\omega \mid x)$ is decreasing in $x$: a higher signal corresponds to a higher probability that the state of the world is high. Figure 3 illustrates the FOSD ordering over posteriors for a 3-state example.
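
As a minimal sketch of the FOSD ranking in Example 1, the following check compares two posteriors on a finite, ordered state space. The function name `fosd_dominates` and the two posteriors are hypothetical and chosen only for illustration; they are not taken from the paper's Figure 3.

```python
from fractions import Fraction as Fr
from itertools import accumulate

def fosd_dominates(p_high, p_low):
    """True if p_high first-order stochastically dominates p_low.

    Both arguments list probabilities over states ordered from lowest to
    highest; dominance means the CDF of p_high is pointwise no larger."""
    cdf_high = list(accumulate(p_high))
    cdf_low = list(accumulate(p_low))
    return all(h <= l for h, l in zip(cdf_high, cdf_low))

# Hypothetical posteriors after a low and a high signal (three states).
post_low = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]
post_high = [Fr(1, 6), Fr(1, 3), Fr(1, 2)]

print(fosd_dominates(post_high, post_low))  # True
print(fosd_dominates(post_low, post_high))  # False
```

Exact fractions are used so that the CDF comparison is not affected by floating-point rounding.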

Example 2. Concave returns (Incremental returns concave in $\omega$). Suppose $U_1$ is the set of concave functions. Then $U_2$ is the set of functions $u(\omega,a)$ that have concave incremental returns. Such a payoff function can arise naturally in binary decision problems ($a \in \{0,1\}$) where the DM must decide whether or not to undertake some risky venture with concave payoff $r(\omega)$ (in Section 5 we present a hiring problem of this type). More generally, if $u \in U_2$, then the DM becomes strongly more risk averse as she raises her action.^13 For this case $P^H \succ_{SC-U_1} P^L$ if and only if $P^H$ dominates $P^L$ according to SOSD ($P^H \succ_{SOSD} P^L$): the DM takes a higher action when the posterior about $\omega$ is "less risky." So if $F \in \mathcal{F}$, then $F(\omega \mid x^H) \succ_{SOSD} F(\omega \mid x^L)$. This requires that $E[\omega \mid x]$ is constant in $x$ and that, for any $\omega$, $\int_{s \le \omega} F(s \mid x)\,ds$ is nonincreasing in $x$. Alternatively, for any $x^L < x^H$, $F(\omega \mid x^L)$ can be attained from $F(\omega \mid x^H)$ by a sequence of mean-preserving spreads (Rothschild and Stiglitz, 1970).
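
The SOSD requirement in Example 2 can be illustrated on a finite grid of states. The sketch below assumes both posteriors live on the same ordered grid, and it reads "less risky" as dominance for every concave $r$ (equal means plus the integrated-CDF condition). The names `mean`, `integrated_cdf`, and `less_risky`, and the numerical posteriors, are hypothetical.

```python
from fractions import Fraction as Fr

def mean(states, p):
    """Expected state under distribution p."""
    return sum(w * q for w, q in zip(states, p))

def integrated_cdf(states, p):
    """Integral of the CDF from the lowest state up to each grid point."""
    out, cdf, area = [], Fr(0), Fr(0)
    for k, w in enumerate(states):
        if k > 0:
            area += cdf * (w - states[k - 1])  # CDF is flat between grid points
        cdf += p[k]
        out.append(area)
    return out

def less_risky(states, p_high, p_low):
    """True if p_high dominates p_low for every concave r: equal means, and
    the integrated CDF of p_high lies weakly below that of p_low."""
    if mean(states, p_high) != mean(states, p_low):
        return False
    return all(h <= l for h, l in zip(integrated_cdf(states, p_high),
                                      integrated_cdf(states, p_low)))

states = [0, 1, 2]
safe = [Fr(0), Fr(1), Fr(0)]         # all mass on the middle state
risky = [Fr(1, 2), Fr(0), Fr(1, 2)]  # a mean-preserving spread of `safe`

print(less_risky(states, safe, risky))  # True
print(less_risky(states, risky, safe))  # False
```

Because both integrated CDFs are piecewise linear with kinks only at the grid points, comparing them at the grid points is enough in this finite setting.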

Example 3. WSC($\omega_0$) (Incremental returns weak single crossing at $\omega_0$). Suppose $U_1$ is the set of functions $r(\omega)$ such that $r(\omega) \le 0$ for $\omega < \omega_0$ and $r(\omega) \ge 0$ for $\omega > \omega_0$, i.e. the functions that cross zero from below at $\omega_0$. We say such a function satisfies WSC($\omega_0$). Payoff functions in this class arise in the context of investment under uncertainty problems, where $\omega$ might represent the return on a risky asset, $\omega_0$ is the return on a risk-free asset, $a$ is the portfolio weight on the risky asset, and investor payoffs are given by $v(a\omega + (1-a)\omega_0)$. In this case, if $P^H, P^L$ have densities $p^H, p^L$ with respect to Lebesgue measure, then $P^H \succ_{SC-U_1} P^L$ if and only if $p^H(\omega) - \frac{p^H(\omega_0)}{p^L(\omega_0)}\,p^L(\omega)$ satisfies WSC($\omega_0$) in $\omega$ (Athey, 1998b). Thus if $F \in \mathcal{F}$, this reduces to $\frac{f(\omega \mid x^H)}{f(\omega \mid x^L)} \le (\ge) \frac{f(\omega_0 \mid x^H)}{f(\omega_0 \mid x^L)}$ as $\omega < (>)\ \omega_0$. A high signal means that it is more likely that the true state $\omega$ is greater than $\omega_0$. Since the class of WSC($\omega_0$) functions does not contain constant functions, a stronger condition is required to order posteriors by stochastic dominance. For convenience, define the set of functions WSC($\omega_0$) as the set of functions $r(\omega)$ satisfying WSC($\omega_0$) and $r(\omega_0) = 0$. We have $P^H \succ_{SD-U_1} P^L$ if and only if $p^H(\omega) - p^L(\omega)$ is WSC($\omega_0$). In particular, the densities must cross at $\omega_0$, a restriction not imposed by $\succ_{SC-U_1}$.

Example 4. Weak Single Crossing (Incremental returns single crossing in $\omega$). Suppose $U_1$ is the set of functions $r(\omega)$ that cross zero from below at some point $\omega_0$. In other words, $U_1$ is the union of all the WSC($\omega_0$) sets. Payoff functions $u(\omega,a)$ with incremental returns that are single crossing also arise throughout economics, for instance in bidding and pricing problems.^14 For the class of single crossing functions, we have $P^H \succ_{SC-WSC} P^L$ if and only if $P^H \succ_{SC-WSC(\omega_0)} P^L$ for all $\omega_0$, which implies that (where the densities exist) $p^H(\omega)/p^L(\omega)$ is nondecreasing in $\omega$. This order has received a great deal of attention in economics and statistics, and it is known as the Monotone Likelihood Ratio (MLR) order ($P^H \succ_{MLR} P^L$). Then if $F \in \mathcal{F}$, $F(\omega \mid x)$ must be ordered by MLR, and if a density exists, $f(\omega \mid x^H)/f(\omega \mid x^L)$ is nondecreasing in $\omega$.^15

^13 Recall that $r^H(\omega)$ is strongly more risk averse than $r^L(\omega)$ if there is some $\lambda > 0$ such that $r^H(\omega) - \lambda r^L(\omega)$ is concave. See Ross (1981) or Jewitt (1986).

^14 See Milgrom and Shannon (1994) for an extensive discussion of the single crossing property, and Athey (1998b) for applications in stochastic problems.

Note that the class of payoff functions with single crossing incremental returns includes the class of supermodular payoff functions and the class of payoff functions with incremental returns that are WSC($\omega_0$). But expanding the set of payoff functions comes at a cost: we have to limit the set of information structures that we can attempt to compare. For instance, requiring the posteriors to be ordered by MLR is significantly stronger than requiring the posteriors to be ordered by FOSD.^16 This is illustrated in Figure 3.
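
To make the gap between the MLR and FOSD requirements concrete, here is a small sketch on three ordered states. The helper names and the posteriors are hypothetical. The MLR test uses cross-multiplication, $p^H(\omega_j)\,p^L(\omega_i) \ge p^H(\omega_i)\,p^L(\omega_j)$ for $i < j$, which is a standard way to state a nondecreasing likelihood ratio when some probabilities are zero.

```python
from fractions import Fraction as Fr
from itertools import accumulate

def mlr_dominates(p_high, p_low):
    """True if p_high[w] / p_low[w] is nondecreasing in the state index w.
    Cross-multiplication avoids dividing by zero probabilities."""
    n = len(p_high)
    return all(p_high[j] * p_low[i] >= p_high[i] * p_low[j]
               for i in range(n) for j in range(i + 1, n))

def fosd_dominates(p_high, p_low):
    """True if the CDF of p_high is pointwise no larger than that of p_low."""
    return all(h <= l for h, l in zip(accumulate(p_high), accumulate(p_low)))

# A hypothetical pair ranked by FOSD but not by MLR.
low_a, high_a = [Fr(2, 3), Fr(0), Fr(1, 3)], [Fr(0), Fr(2, 3), Fr(1, 3)]
# A hypothetical pair ranked by both orders.
low_b, high_b = [Fr(1, 2), Fr(1, 3), Fr(1, 6)], [Fr(1, 6), Fr(1, 3), Fr(1, 2)]

print(fosd_dominates(high_a, low_a), mlr_dominates(high_a, low_a))  # True False
print(fosd_dominates(high_b, low_b), mlr_dominates(high_b, low_b))  # True True
```

The first pair shows that an information structure can satisfy the FOSD restriction of Example 1 while failing the MLR restriction of Example 4.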

More generally, our theory applies if $U_1$ is some arbitrary closed convex cone of functions, such as linear functions or quadratic functions or functions with positive $n$th derivatives. For any such case, there is a rich theory of stochastic dominance that allows one to characterize the relation $\succ_{SD-U_1}$, and hence the relevant restriction on posterior beliefs, in terms of the "extreme points" of the cone $U_1$.^17 We leave this pursuit to the reader.

4     Ordering  Information  Structures 

This section contains our main results on ordering information structures. We begin by identifying a sufficient condition for all DMs with payoffs $u \in U_2$ to prefer an information structure $G \in \mathcal{F}$ to another information structure $F \in \mathcal{F}$. We call this condition the "monotone information order" (MIO) for $(U_2,\mathcal{F})$; it can be characterized using the stochastic dominance order induced by $U_1$. We show that (MIO) is not just sufficient, but also necessary, for small (differential) changes in the information structure when $U_1$ is a closed convex cone that contains the constant functions. Section 4.2 provides a general analysis of informativeness for classes of MDPs based on stochastic single crossing. Recall from Section 3.1 that when $U_1$ does not contain the constant functions, $\succ_{SC-U_1}$ is weaker than $\succ_{SD-U_1}$. Thus, the ordering based on stochastic single crossing can be used to compare a greater variety of information structures when $U_1$ does not contain constant functions. We derive Lehmann's (1988) effectiveness order as a corollary, and relate our work to his. Examples are provided.

^15 Athey (1998b) also shows that the set of log-supermodular payoff functions (where $u$ is positive and $u(\omega,a^H)/u(\omega,a^L)$ is nondecreasing in $\omega$) induces the same stochastic single crossing order, $\succ_{SC-U_1}$, despite the fact that the set of log-supermodular functions is smaller than the set of payoff functions with incremental returns that are single crossing. However, the set of log-supermodular payoffs does not satisfy (MDP-U), and thus our analysis below does not apply directly to this class of payoffs. To see an example where log-supermodular payoffs arise, suppose that the action $a$ is price, firms maximize $(a - c)D(a,\omega)$, and higher states of the world correspond to more inelastic demand.

^16 In particular, assuming $F$ is indexed by the MLR is equivalent to assuming that $F(\omega \mid x)$ is ordered by FOSD for all prior distributions $H(\omega)$ (Milgrom, 1981).

^17 That is, we check (SD) for some set of functions $E_U$ such that $ccc(E_U \cup \{1,-1\}) = U$. See Karlin and Studden, 1966; Border, 1991; Jewitt, 1986; Athey 1998a.

4.1     Monotone  Information  Orders  Using  Stochastic  Dominance 

We  begin  by  stating  a  sufficient  condition  for  G  to  be  more  informative  than  F  to  a  given  class  of 
decision-makers. 

Theorem 3 Let $(U_2,\mathcal{F})$ be an MDP pair and suppose (A) holds. Consider any $F, G \in \mathcal{F}$. Then $V^*(G,u) \ge V^*(F,u)$ for all $u \in U_2$ if for all $\alpha \in (0,1)$

\[ G(\omega \mid G(y) > \alpha) \succ_{SD-U_1} F(\omega \mid F(x) > \alpha). \tag{MIO} \]

If the assumptions of Theorem 3 hold and (MIO) obtains, we write $G \succ_{MIO-U_1} F$, i.e. $G$ is greater than $F$ in the monotone information order induced by payoff functions with incremental returns in $U_1$. The condition (MIO) is easy to interpret. It says that the high posteriors induced by $G$ (where "high" means $G(y) > \alpha$) are on average higher (according to stochastic dominance) than the corresponding high posteriors induced by $F$.^18 This condition is also equivalent to saying that the low posteriors induced by $G$ are lower: observe that

\[ \alpha F(\omega \mid F(x) \le \alpha) + (1-\alpha) F(\omega \mid F(x) > \alpha) = H(\omega), \tag{7} \]

where $H(\omega)$ is the prior. So an equivalent expression to (MIO) is that for all $\alpha \in [0,1]$,

\[ F(\omega \mid F(x) \le \alpha) \succ_{SD-U_1} G(\omega \mid G(y) \le \alpha). \tag{8} \]

Notice that in this condition, signals are implicitly indexed by their ex ante percentile. This motivates us to use the index $\alpha_x = F(x)$, as described in Section 3.2 and illustrated in Figure 2. Thus, throughout this subsection, we let $F(\omega,\alpha) = F(\omega,F^{-1}(\alpha))$ and let $G(\omega,\alpha) = G(\omega,G^{-1}(\alpha))$.
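
The step from (MIO) to (8) can be written out in one line. The sketch below assumes that $F$ and $G$ share the prior $H$, that the percentile indices $F(x)$ and $G(y)$ are uniform on $[0,1]$, and that $\succ_{SD-U_1}$ is read as "expected value weakly higher for every $r \in U_1$"; the subscripts on $E_F$ and $E_G$ are notation introduced here to indicate which information structure generates the conditional expectation.

```latex
\begin{align*}
\alpha\,E_F[r(\omega)\mid F(x)\le\alpha] + (1-\alpha)\,E_F[r(\omega)\mid F(x)>\alpha]
  &= \int_\Omega r(\omega)\,dH(\omega) \\
  &= \alpha\,E_G[r(\omega)\mid G(y)\le\alpha] + (1-\alpha)\,E_G[r(\omega)\mid G(y)>\alpha],
\end{align*}
so that, for every $\alpha \in (0,1)$ and every $r \in U_1$,
\[
E_G[r(\omega)\mid G(y)>\alpha] \ge E_F[r(\omega)\mid F(x)>\alpha]
\iff
E_F[r(\omega)\mid F(x)\le\alpha] \ge E_G[r(\omega)\mid G(y)\le\alpha].
\]
```

Both sides of (7) integrate to the same prior expectation, so raising the upper-tail average under $G$ is the same as lowering the lower-tail average under $G$.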


^18 Analogous to Blackwell, one can interpret (MIO) as saying that the $G$ posteriors are "more spread out" than the $F$ posteriors. To see this, note that (MIO) is equivalent to saying that for all $\alpha \in [0,1]$,

\[ \int_0^\alpha G(\omega \mid G(y) = s)\,ds \prec_{SD-U_1} \int_0^\alpha F(\omega \mid F(x) = s)\,ds, \]

which can be interpreted in a manner similar to the second order stochastic dominance condition for comparing two distributions, $F^H$ and $F^L$, on $[0,1]$, which requires that $\int_0^z F^H(s)\,ds \le \int_0^z F^L(s)\,ds$ with equality when $z = 1$.

Proof of Theorem 3. We show that (MIO) actually implies a stronger result, namely that under (MIO), for any $u \in U_2$, $V(G,u,a) \ge V(F,u,a)$ for any $a(\alpha)$ nondecreasing. This will imply that $G$ is more informative than $F$ under the conditions of Theorem 3. To see this, observe that $a^F(\alpha) \in \arg\max_a \int_\Omega u(\omega,a)\,dF(\omega \mid F(x) = \alpha)$ is nondecreasing, and thus by revealed preference

\[ V^*(G,u) \ge V(G,u,a^F) \ge V(F,u,a^F) = V^*(F,u). \]
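Suppose that $A = \{a_1,\ldots,a_n\}$ is finite, with $a_{i+1} > a_i$. Define

\[ r_i(\omega) = u(\omega,a_{i+1}) - u(\omega,a_i). \]

Consider some arbitrary monotone increasing policy $a : [0,1] \to A$. We can find $\alpha_0 = 0 \le \alpha_1 \le \ldots \le \alpha_{n-1} \le \alpha_n = 1$ such that $a(\alpha) = a_i$ on $[\alpha_{i-1},\alpha_i]$. Then we have

\begin{align*}
V(F,u,a) &= \int_\Omega \int_{[0,1]} u(\omega,a(\alpha))\,dF(\omega,\alpha) \\
&= \int_\Omega \sum_{i=1}^{n} u(\omega,a_i)\,\big[F(\alpha_i \mid \omega) - F(\alpha_{i-1} \mid \omega)\big]\,dF(\omega) \\
&= E[u(\omega,a_1)] + \int_\Omega \Big\{ \sum_{i=1}^{n-1} \big[u(\omega,a_{i+1}) - u(\omega,a_i)\big]\,\big[1 - F(\alpha_i \mid \omega)\big] \Big\}\,dF(\omega) \\
&= E[u(\omega,a_1)] + \sum_{i=1}^{n-1} (1-\alpha_i) \int_\Omega r_i(\omega)\,dF(\omega \mid \alpha_x > \alpha_i) \\
&\le E[u(\omega,a_1)] + \sum_{i=1}^{n-1} (1-\alpha_i) \int_\Omega r_i(\omega)\,dG(\omega \mid \alpha_y > \alpha_i) = V(G,u,a).
\end{align*}

The equalities follow by algebraic manipulation and Bayes' rule. The inequality follows directly from (MIO) since, by (MDP-U), we know that for each $i$, $r_i(\omega) \in U_1$. The case of $A$ compact follows via a limiting argument and is deferred until the next section. □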
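
The decomposition used in the proof can be checked numerically in a discrete analogue. The sketch below assumes finitely many states, signals, and actions, and replaces the term $(1-\alpha_i)\,E[r_i(\omega) \mid \alpha_x > \alpha_i]$ by the joint-probability-weighted sum of $r_i$ over signals at which the policy has already moved above $a_i$. The function names and the numbers are hypothetical.

```python
from fractions import Fraction as Fr

def value_direct(joint, payoff, policy):
    """Ex ante payoff: sum over states w and signals x of p(w, x) * u(w, a(x))."""
    return sum(joint[w][x] * payoff[w][policy[x]]
               for w in range(len(joint)) for x in range(len(policy)))

def value_incremental(joint, payoff, policy):
    """Same payoff written as E[u(w, a_1)] plus the incremental gambles r_i,
    each weighted by the joint probability of w and signals with a(x) above a_i."""
    n_actions = len(payoff[0])
    total = Fr(0)
    for w, row in enumerate(joint):
        total += sum(row) * payoff[w][0]           # baseline action a_1
        for i in range(n_actions - 1):
            r_i = payoff[w][i + 1] - payoff[w][i]  # incremental return
            tail = sum(p for x, p in enumerate(row) if policy[x] > i)
            total += r_i * tail
    return total

# Hypothetical example: 2 states, 3 ordered signals, 3 ordered actions.
joint = [[Fr(3, 10), Fr(1, 10), Fr(1, 10)],   # p(w = low, x)
         [Fr(1, 10), Fr(1, 10), Fr(3, 10)]]   # p(w = high, x)
payoff = [[0, -1, -3],                        # u(w = low, a)
          [0, 2, 3]]                          # u(w = high, a)
policy = [0, 1, 2]                            # monotone: action index per signal

print(value_direct(joint, payoff, policy))       # 7/10
print(value_incremental(joint, payoff, policy))  # 7/10
```

With a monotone policy the set of signals at which increment $i$ has been taken is an upper set of signals, which is what lets (MIO) be applied term by term.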

The idea in the proof is very simple. Starting from action $a_1$, every new increment to the choice of action entails a new (interim) gamble on $\omega$, described by $r_i(\omega) = u(\omega,a_{i+1}) - u(\omega,a_i) \in U_1$. Formally, payoffs for a given action can be written as the sum over these incremental gambles:

\[ u(\omega,a_i) = u(\omega,a_1) + \sum_{j=1}^{i-1} r_j(\omega). \tag{9} \]

Jumping up to a higher action (say, from $a_i$ to $a_{i+1}$) at some posterior indexed by $F(x) = \alpha_i$ means taking on the gamble for all higher-ranked posteriors as well, since the DM uses a monotone policy. The effect on ex ante payoffs is then given by $E[r_i(\omega) \mid F(x) > \alpha_i]$. Preferences over the gambles $r_i(\omega)$ are described by $\succ_{SD-U_1}$, and (MIO) requires that every such gamble is more favorable under $G$ than under $F$. As this argument applies when the DM uses the optimal policy for signal $x$, there is no possible way for a decision-maker in the class to do better, from an ex ante standpoint, using $F$ instead of $G$.

In the proof, we showed that (MIO) implies $V(G,u,a) \ge V(F,u,a)$ for any $u \in U_2$ and $a(\alpha)$ nondecreasing. Since (MIO) has such powerful consequences, it might seem to be "too strong" as an informativeness order. At a minimum, it seems reasonable to compare $V(G,u,a^F) \ge V(F,u,a^F)$ only for policies $a^F$ which are optimal for information structure $F$. However, for important classes of monotone decision problems, we show that restricting the policy function in this way does not permit a weakening of (MIO).

To see why this might be true, consider the simplest case of $A = \{0,1\}$ and some $u \in U_2$. The optimal policy for $u$ (denoted $a^{F,u}$) entails jumping from $a = 0$ to $a = 1$ at a posterior indexed by $F(x) = \alpha^{F,u}$. Now consider utility functions of the form $w(\omega,a) = u(\omega,a) + aK$, where $K \ne 0$ is an arbitrary constant. There is no particular reason for the policy of jumping to $a = 1$ at $\alpha^{F,u}$ to be optimal for $w$ under $F$; however, a DM with payoff $w$ who does use $a^{F,u}$ (suboptimally) will prefer $G$ to $F$ if and only if the DM with payoff $u$ does as well:

\[ V(G,u,a^{F,u}) - V(F,u,a^{F,u}) - \big[V(G,w,a^{F,u}) - V(F,w,a^{F,u})\big] \tag{10} \]

\[ = (1-\alpha) \int K\,dG(\omega \mid G(y) > \alpha) - (1-\alpha) \int K\,dF(\omega \mid F(x) > \alpha) = 0, \tag{11} \]

with $\alpha = \alpha^{F,u}$. Similarly, a DM with payoff $u$ who uses $a^{F,w}$ (suboptimally) prefers $G$ to $F$ if and only if $V(G,w,a^{F,w}) - V(F,w,a^{F,w}) \ge 0$.^19 To extend this idea, suppose that for every $K$, $w_K = u + aK \in U_2$. Then by varying $K$, we potentially can recover any policy of the form $a = 0$ for $F(x) \le \alpha$, $a = 1$ for $F(x) > \alpha$ as the optimal policy of some $w_K$. And thus for all payoff functions $\{w_K \in U_2\}$ to prefer $G$ to $F$ using their optimal response to $F$, it will be necessary that $V(G,u,a) \ge V(F,u,a)$ for all monotone policies $a(\alpha)$.

This heuristic argument can be made exact under the following condition:

($U_1$-C) $U_1 = ccc(U_1 \cup \{1,-1\})$.

That is, $U_1$ is a closed convex cone, and it contains the constant functions. Under this condition, $r \in U_1$ implies that $r + K \in U_1$ for any $K$. Thus, crucially for our purposes, if $u \in U_2$, then $u + aK \in U_2$.

^19 Note that this argument turns critically on our decision (implicit in (MIO)) to compare $G$ to $F$ using (fixed) policy functions that depend on a posterior's rank in the ex ante distribution, $\alpha_y = G(y)$. Had we used policy functions $a : [0,1] \to A$ based on some other index (not the marginal), the policy of increasing the action at $\alpha_y = \alpha$ for information structure $G$ would no longer correspond to choosing a higher action when $G(y) > \alpha$ and (11) would fail.

The following theorem establishes that, when ($U_1$-C) holds, then for small (differential) changes in the information structure, (MIO) is both necessary and sufficient for a ranking of information structures.

Theorem 4 Let $(U_2,\mathcal{F})$ be an MDP pair such that ($U_1$-C) and (A) hold. Let $F^\theta(\omega,x)$ be smoothly parametrized by $\theta$, with $F^\theta \in \mathcal{F}$ for all $\theta$. Then $\frac{\partial}{\partial\theta} V^*(F^\theta,u) \ge 0$ for all $u \in U_2$, if and only if $F^{\theta+d\theta} \succ_{MIO-U_1} F^\theta$.

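Proof. Sufficiency follows from above. We prove a stronger result, that (MIO) is necessary for $V(G,u,a^F) \ge V(F,u,a^F)$, where $a^F$ is the optimal policy for information structure $F$. We proceed by assuming that (MIO) fails and constructing a contradiction. Suppose $G \not\succ_{MIO-U_1} F$. Then, using the equivalent form (8), there is some $\alpha$ and $\tilde r \in U_1$ such that

\[ \int_\Omega \tilde r(\omega)\,dG(\omega \mid G(y) \le \alpha) > \int_\Omega \tilde r(\omega)\,dF(\omega \mid F(x) \le \alpha). \tag{12} \]

Let $A = \{0,1\}$. Then compute $K_\alpha = E[\tilde r(\omega) \mid F(x) = \alpha]$, and define

\[ u(\omega,a) = a\,(\tilde r(\omega) - K_\alpha). \]

By ($U_1$-C), since $\tilde r \in U_1$, then $u \in U_2$. An optimal policy for this DM is to set $a^F(\alpha') = 1$ if $\alpha' > \alpha$ and zero otherwise. Then $V(G,u,a^F) \ge V(F,u,a^F)$ if and only if

\[ \int_{[0,1]} \int_\Omega u(\omega,a^F(\alpha'))\,dG(\omega,G^{-1}(\alpha')) \ge \int_{[0,1]} \int_\Omega u(\omega,a^F(\alpha'))\,dF(\omega,F^{-1}(\alpha')), \tag{13} \]

which, because $F$ and $G$ share the same prior, holds if and only if

\[ \int_\Omega [\tilde r(\omega) - K_\alpha]\,dG(\omega \mid G(y) \le \alpha) \le \int_\Omega [\tilde r(\omega) - K_\alpha]\,dF(\omega \mid F(x) \le \alpha). \tag{14} \]

But the expected value of a constant function is simply that constant. So (13) and (14) contradict (12).

For the case of small changes, consider $F^\theta$, a smoothly parameterized family, and let $a^\theta(\alpha)$ be the optimal policy. Then, by the envelope theorem, $\frac{\partial}{\partial\theta} V(F^\theta,u,a^{\theta'})\big|_{\theta'=\theta} = \frac{d}{d\theta} V(F^\theta,u,a^\theta)$. Then apply the above argument to compare $F^{\theta+d\theta}$ and $F^\theta$ at the policy $a^\theta$. □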

It is also possible to view the necessity result through the lens of statistical hypothesis testing. Consider testing the null hypothesis $H_0 : r(\omega) \ge 0$ against the alternative $H_1 : r(\omega) < 0$, for some $r(\omega) \in U_1$. However, we impose a constraint on the decision-maker: the test has an average size of $1 - \alpha$. That is, the ex ante probability (over all possible signals) of accepting the null hypothesis is $1 - \alpha$. A decision-maker facing this problem would then solve:

\[ \max_{a(x) \in \{0,1\}} \int_X \int_\Omega a(x)\,r(\omega)\,dF(\omega,x) \quad \text{s.t.} \quad \int_X a(x)\,dF(x) = 1 - \alpha. \]

Assuming the conditions of Theorem 4 apply, an optimal policy for this testing problem will set $a(x) = 1$ if and only if $F(x) > \alpha$. The ex ante expected payoff will be $(1-\alpha) \cdot E[r(\omega) \mid F(x) > \alpha]$, which, by (MIO), is no larger than the payoff from solving the inference problem associated with $G$, $(1-\alpha) \cdot E[r(\omega) \mid G(y) > \alpha]$. Thus, (MIO) can be interpreted as requiring that better information allows (on average) better inference about the returns to taking a higher action.^20

4.1.1     Examples 

We  now  revisit  two  of  our  examples  from  above. 

Example 1. Supermodular (Incremental returns increasing in $\omega$). Since $U_1$ is the set of nondecreasing functions, the relation $\succ_{SD-U_1}$ corresponds to FOSD. Thus $G$ is higher than $F$ in the supermodular monotone information order (MIO-SPM) if and only if for all $\alpha \in (0,1)$, $G(\omega \mid \alpha_y > \alpha) \succ_{FOSD} F(\omega \mid \alpha_x > \alpha)$. That is, high signals from $G$ lead on average to higher posterior beliefs than high signals from $F$. Recall from above that if $F, G \in \mathcal{F}$, then $F(\omega \mid \alpha)$, $G(\omega \mid \alpha)$ are increasing in the sense of FOSD as $\alpha$ increases. So a higher signal is "good news" about the state of the world. If $G \succ_{MIO-U_1} F$ then high signals are (on average) "better news" under $G$ than under $F$. Since high signals lead to high actions, one can think intuitively about $G$ bringing about a better match between actions and the true state of the world.

It is interesting to relate (MIO-SPM) to sufficiency. To do this, we return to our earlier three-state, two-signal example. Recall the DM has uniform prior over the three states $\omega_1 < \omega_2 < \omega_3$. As illustrated in Figure 4, let $y$ be a signal that puts equal likelihood on posteriors $(\frac{2}{3},\frac{1}{6},\frac{1}{6})$ and $(0,\frac{1}{2},\frac{1}{2})$. Note that $y$ is admissible for supermodular MDPs since $(0,\frac{1}{2},\frac{1}{2}) \succ_{FOSD} (\frac{2}{3},\frac{1}{6},\frac{1}{6})$. Consider an information structure $z$ that puts equal likelihood on posteriors $(\frac{2}{3},\frac{1}{3},0)$ and $(0,\frac{1}{3},\frac{2}{3})$. Clearly $z$ is also admissible for supermodular MDPs. Further, from the discussion in Section 3.3, $y$ and $z$ cannot be compared by Blackwell's sufficiency criterion. However, all supermodular decision-makers prefer $z$ to $y$, i.e. $z \succ_{MIO-SPM} y$. The reason is simple: $z$'s posteriors are more spread out according to FOSD. When the DM sees a low signal realization from $z$, her beliefs about the state are truly pessimistic, while the converse holds for high signal realizations.

^20 To see why this alternative approach is essentially equivalent to our necessity proof, let $\lambda$ be the Lagrange multiplier on the constrained optimization problem for $\tilde r$, $\alpha$ (defined as in the proof). Then $u$ in the proof equals $a(\tilde r(\omega) + \lambda)$.

Figure 5 illustrates this shift in a different way. For each signal, we label the lower (by FOSD) posterior $L$, and the higher posterior $H$. We see that signals that are higher according to (MIO-SPM) place more probability weight on $(\omega_2,L)$ and $(\omega_3,H)$. In fact, using ideas from bivariate stochastic dominance, it is possible to show that (MIO-SPM) can in general be characterized this way. Results from Meyer (1991) can be used to show that $G \succ_{MIO-SPM} F$ if and only if $G$ is obtained from $F$ using a "marginal-preserving spread" of the form illustrated in Figure 3.^21

As a further comparison, consider the signal structure $x$ that puts equal likelihood on posteriors $(\frac{2}{3},0,\frac{1}{3})$ and $(0,\frac{2}{3},\frac{1}{3})$. It follows easily that $x$ is admissible for supermodular decision problems and that $y, z \succ_{MIO-SPM} x$. Yet no two-point information structure is sufficient for $x$. Moreover, if we perturb $x$ so that it generates posteriors $(\frac{2}{3} - \epsilon, \epsilon + \eta, \frac{1}{3} - \eta)$ and $(\epsilon, \frac{2}{3} - \epsilon - \eta, \frac{1}{3} + \eta)$ with equal probability (where $\epsilon > 0$, $\eta > 0$, and $\epsilon + 2\eta < \frac{1}{3}$), then we still have $y \succ_{MIO-SPM} x$, even though the two signals are not Blackwell comparable (illustrated in Figure 4). So there is a sense in which our information order is robust.
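
The rankings among $x$, $y$, and $z$ can be verified mechanically. The sketch below is a hypothetical checker, not the paper's procedure. It assumes the posteriors are $(\frac{2}{3},0,\frac{1}{3})$, $(0,\frac{2}{3},\frac{1}{3})$ for $x$; $(\frac{2}{3},\frac{1}{6},\frac{1}{6})$, $(0,\frac{1}{2},\frac{1}{2})$ for $y$; and $(\frac{2}{3},\frac{1}{3},0)$, $(0,\frac{1}{3},\frac{2}{3})$ for $z$, each realization having probability one half. It also assumes that the percentile index spreads each discrete realization uniformly over its percentile range.

```python
from fractions import Fraction as Fr

def upper_tail(signal, alpha):
    """State masses carried by the top (1 - alpha) share of signal realizations.

    signal: list of (probability, posterior) pairs sorted from the lowest to the
    highest posterior; each realization is spread uniformly over its percentile
    range (an assumption needed because the signals are discrete)."""
    masses = [Fr(0)] * len(signal[0][1])
    cum = Fr(0)
    for prob, posterior in signal:
        lo, hi = cum, cum + prob
        weight = max(Fr(0), hi - max(lo, alpha))  # part of [lo, hi] above alpha
        masses = [m + weight * p for m, p in zip(masses, posterior)]
        cum = hi
    return masses

def tail_dominates(m_g, m_f):
    """FOSD for two state-mass vectors with equal totals: compare upper sums."""
    sum_g = sum_f = Fr(0)
    for g, f in zip(reversed(m_g), reversed(m_f)):
        sum_g, sum_f = sum_g + g, sum_f + f
        if sum_g < sum_f:
            return False
    return True

def mio_spm(g, f):
    """Check (MIO-SPM) at every percentile breakpoint of either signal; the tail
    masses are piecewise linear in alpha, so the breakpoints suffice."""
    cuts = {Fr(0)}
    for signal in (g, f):
        cum = Fr(0)
        for prob, _ in signal:
            cum += prob
            cuts.add(cum)
    return all(tail_dominates(upper_tail(g, a), upper_tail(f, a)) for a in sorted(cuts))

half = Fr(1, 2)
x = [(half, [Fr(2, 3), Fr(0), Fr(1, 3)]), (half, [Fr(0), Fr(2, 3), Fr(1, 3)])]
y = [(half, [Fr(2, 3), Fr(1, 6), Fr(1, 6)]), (half, [Fr(0), half, half])]
z = [(half, [Fr(2, 3), Fr(1, 3), Fr(0)]), (half, [Fr(0), Fr(1, 3), Fr(2, 3)])]

print(mio_spm(z, y), mio_spm(y, x), mio_spm(y, z))  # True True False
```

Under these assumptions the checker reproduces $z \succ_{MIO-SPM} y \succ_{MIO-SPM} x$ and confirms that the ranking does not hold in the reverse direction.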

Our examples highlight the fact that when there are three or more states of the world, there is a significant restriction in Blackwell's requirement that the posteriors of the "bad" signal must lie in the convex hull of the "good" signal posteriors. If there are only two states of the world, $|\Omega| = 2$ (the state of the world is a "dichotomy"), then this restriction is no longer severe, since all posteriors can be summarized in a single dimension. It can be shown that on dichotomies (MIO-SPM) is equivalent to sufficiency.

Example 2. Concave (Incremental returns concave in $\omega$). When $U_1$ is the set of concave functions, the relation $\succ_{SD-U_1}$ corresponds to SOSD. Recall from above that (MDP-$\mathcal{F}$) implies that $F(\omega \mid \alpha)$ becomes "less risky" (by SOSD) as $\alpha$ increases. So $G \succ_{MIO-CV} F$ if, for any $\alpha$, $G(\omega \mid \alpha_y > \alpha) \succ_{SOSD} F(\omega \mid \alpha_x > \alpha)$: high signals lead on average to less risky posteriors under $G$ than under $F$. Since high signals lead to high actions and high actions lead to interim preferences that are more risk-averse, one can see immediately that having a better match between high signals and low risk will be valuable to the DM. We give an example in Section 6 where such preferences arise. Interestingly (MIO-CV) implies sufficiency when there are only two or three possible states of the world.

^21 See also Levy and Paroush (1974) and Shaked and Shanthikumar (1997) for more discussion of the stochastic dominance order associated with supermodular functions.

Just as we could find an MDP pair $(U_2,\mathcal{F})$ corresponding to any set of incremental return functions $U_1$ that forms a closed convex cone, we can describe the relevant monotone information order condition. Once again, $\succ_{MIO-U_1}$ says that if the average posteriors under $G$ stochastically dominate the average posteriors under $F$ (with respect to functions in $U_1$), then $G$ is more informative than $F$ for the class of problems in $(U_2,\mathcal{F})$. And we know that any stochastic dominance relationship induced by a closed convex cone can be described in terms of the extreme points of the cone.^22

4.2     Monotone  Information  Orders  Using  Stochastic  Single  Crossing 

In the previous section we indexed posteriors by their ex ante percentile and then compared information structures, finding that informativeness rankings could be derived using familiar notions of stochastic dominance. For this approach to be "tight," we required $U_1$ to contain the positive and negative constant functions. When $U_1$ does not contain the constants, but does contain some other function $\hat r$ and its additive inverse $-\hat r$, we can extend our results using stochastic single crossing in place of stochastic dominance.

We introduce an additional pair of restrictions (jointly denoted (R)) in order for $(U_2,\mathcal{F})$ to be admissible:

(R-U) $U_1$ is a closed convex cone, and there is some $\hat r(\omega) \ge 0$ such that $\hat r, -\hat r \in U_1$.

(R-$\mathcal{F}$) For an $\hat r$ satisfying (R-U), for all $F \in \mathcal{F}$ with associated signal $x$, there exists $b > 0$ such that $E[\hat r(\omega) \mid x] \ge b$ for all $x$ in support($x$).

If $U_1$ contains the constant functions, then (R) holds with $\hat r(\omega) = 1$. If $U_1$ is the class of WSC($\omega_0$) functions, then (R) holds with $\hat r = 1_{\{\omega \approx \omega_0\}}$ (that is, $\hat r$ is an indicator function for a "small" interval containing $\omega_0$).

We  begin  with  a  Lemma  that  generalizes  a  key  step  in  the  proof  of  sufficiency  above. 

Lemma 5 Let $U_2$ satisfy (MDP-U) and assume (A) holds. Consider two information structures $F$ and $G$, where $F(\omega) = G(\omega)$, and continuous strictly increasing functions $T_F : X \to [0,1]$, $T_G : Y \to [0,1]$. Define $m(\omega,\alpha) = G(\omega,T_G^{-1}(\alpha)) - F(\omega,T_F^{-1}(\alpha))$. Then

\[ \int_\Omega \int_{[0,1]} u(\omega,a(\alpha))\,dm(\omega,\alpha) \ge 0 \tag{15} \]

for all $u \in U_2$ and all $a : [0,1] \to A$ nondecreasing, if and only if for all $r \in U_1$ and all $\alpha \in (0,1)$,

\[ \int_\Omega r(\omega)\,d_\omega m(\omega,\alpha) \le 0. \tag{16} \]

^22 It is interesting to observe in passing that if the DM's marginal returns are both nondecreasing and concave, we can concatenate our conditions, as is often done for straightforward preferences over gambles. For example, such a DM would be better off after a sequence of two changes to $F(\omega \mid \alpha_x > \alpha)$, one that makes higher states more likely (according to FOSD) and one that reduces risk (according to SOSD).

An important thing to note is that $m(\cdot,\alpha)$ is not necessarily a probability distribution on $\omega$; checking (16) when $r(\omega) = 1$ entails checking that $\Pr(T_G(y) \le \alpha) \le \Pr(T_F(x) \le \alpha)$. Thus, our choice of index in the last subsection can be immediately understood: if $\{1,-1\} \subset U_1$, (16) holds for all $r \in U_1$ only if $\Pr(T_G(y) \le \alpha) = \Pr(T_F(x) \le \alpha)$, which is always true when $T_F(x) = F(x)$ and $T_G(y) = G(y)$.

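When $\hat r, -\hat r \in U_1$, then (16) requires that

\[ \int_\Omega \hat r(\omega)\,d_\omega m(\omega,\alpha) = 0. \tag{17} \]

This motivates our new choice of indexing functions, $T_F$, $T_G$:

\[ T_F(x) = \frac{\int_\Omega \hat r(\omega)\,dF(\omega,x)}{\int_\Omega \hat r(\omega)\,dF(\omega)}, \qquad T_G(y) = \frac{\int_\Omega \hat r(\omega)\,dG(\omega,y)}{\int_\Omega \hat r(\omega)\,dG(\omega)}. \tag{18} \]

This choice of $T_F, T_G$ guarantees that (17) will hold, meaning that it remains only to identify a condition under which (16) will hold for all $r \in U_1$, $r \ne \hat r$. If $\hat r = 1$, then $T_F(x) = F(x)$ as above. If $U_1$ is the class of WSC($\omega_0$) functions, then $T_F(x) = F(x \mid \omega_0)$. In general, (R-$\mathcal{F}$) implies that $T_F : X \to [0,1]$ is strictly increasing.^23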
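
A short verification shows why this indexing delivers (17). The sketch assumes that (18) defines $T_F(x)$ as the $\hat r$-weighted distribution function of the signal, $T_F(x) = \int_\Omega \hat r(\omega)\,dF(\omega,x) \big/ \int_\Omega \hat r(\omega)\,dF(\omega)$ (and likewise for $T_G$), and that $F(\omega) = G(\omega)$ as in Lemma 5. The symbols $x_\alpha = T_F^{-1}(\alpha)$ and $y_\alpha = T_G^{-1}(\alpha)$ are introduced here only for illustration.

```latex
\begin{align*}
\int_\Omega \hat r(\omega)\, d_\omega m(\omega,\alpha)
  &= \int_\Omega \hat r(\omega)\, d_\omega G(\omega, y_\alpha)
   - \int_\Omega \hat r(\omega)\, d_\omega F(\omega, x_\alpha) \\
  &= T_G(y_\alpha) \int_\Omega \hat r(\omega)\, dG(\omega)
   - T_F(x_\alpha) \int_\Omega \hat r(\omega)\, dF(\omega) \\
  &= \alpha \int_\Omega \hat r(\omega)\, dG(\omega)
   - \alpha \int_\Omega \hat r(\omega)\, dF(\omega) = 0 .
\end{align*}
```

The last equality uses the common prior: both integrals equal the ex ante expectation of $\hat r$.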

Theorem 6 Let $(U_2,\mathcal{F})$ be an MDP pair, and suppose (A) and (R) hold. Let $F, G \in \mathcal{F}$. Then $V^*(G,u) \ge V^*(F,u)$ for all $u \in U_2$ if for all $\alpha \in (0,1)$,

\[ G(\omega \mid T_G(y) > \alpha) \succ_{SC-U_1} F(\omega \mid T_F(x) > \alpha) \tag{MIO'} \]

where $T_F$ and $T_G$ are defined by (18).

(MIO') generalizes (MIO) by requiring that the average posteriors be ordered using stochastic single crossing rather than the (stronger) stochastic dominance order. To interpret this, the stochastic single crossing requirement can be restated as follows: for each $\alpha$, and for all $r \in U_1$,

\[ E[r(\omega) \mid T_F(x) > \alpha] \ge 0 \implies E[r(\omega) \mid T_G(y) > \alpha] \ge 0. \tag{19} \]

This contrasts with stochastic dominance, which requires $E[r(\omega) \mid T_G(y) > \alpha] \ge E[r(\omega) \mid T_F(x) > \alpha]$. If $U_1$ contains the constant functions, then stochastic dominance and stochastic single crossing coincide. For this reason, we maintain the notation $G \succ_{MIO-U_1} F$ even when applying (MIO').

^23 And continuity can be ensured (without changing the information content) using Lehmann's (1988) construction.

To apply Theorem 6, it is useful to have an alternative condition for (MIO') that requires checking only a single inequality: for all $\alpha \in (0,1)$ and $r \in U_1$,

\[ E[r(\omega) \mid T_G(y) > \alpha] \cdot \Pr(T_G(y) > \alpha) \ge E[r(\omega) \mid T_F(x) > \alpha] \cdot \Pr(T_F(x) > \alpha). \tag{MIO''} \]

The proof of Theorem 6 proceeds by first establishing the equivalence of (MIO') and (MIO''), and then applying Lemma 5 to show that the result is implied by (MIO'').

Proof. It can be shown^24 that when $U_1$ is a closed convex cone, then for each $\alpha$, (19) holds if and only if there exists a $\lambda(\alpha) \ge 0$ such that

\[ E[r(\omega) \mid T_G(y) > \alpha] - \lambda(\alpha)\,E[r(\omega) \mid T_F(x) > \alpha] \ge 0 \tag{20} \]

for all $r \in U_1$. ($\lambda(\alpha)$ can be interpreted as the Lagrange multiplier in the problem of minimizing $E[r(\omega) \mid T_G(y) > \alpha]$ subject to the constraint that $E[r(\omega) \mid T_F(x) > \alpha] \ge 0$, and the left-hand side of (20) is the minimized value of the objective.) But checking (20) for $\hat r$ and $-\hat r$, both in $U_1$ by (R-U), implies that $\lambda(\alpha) = E[\hat r(\omega) \mid T_G(y) > \alpha] / E[\hat r(\omega) \mid T_F(x) > \alpha]$. The latter ratio is in turn equal to $\Pr(T_F(x) > \alpha) / \Pr(T_G(y) > \alpha)$ (simply substitute in from the definitions of $T_F$ and $T_G$ in (18)). So (20) is equivalent to (MIO''). To complete the proof, let $F(\omega,\alpha) = F(\omega,T_F^{-1}(\alpha))$ and likewise for $G$. By Lemma 5, (MIO'') is equivalent to the following: for all $u \in U_2$ and all monotone decision policies $a : [0,1] \to A$, $V(G,u,a(\cdot)) \ge V(F,u,a(\cdot))$. So for all $u \in U_2$, if $a^F$ is an optimal (monotone) policy for $u$ under $F$, $V^*(G,u) \ge V(G,u,a^F) \ge V^*(F,u)$. □

We now show that (MIO') is tight for a larger class of decision problems than that identified in
Theorem 4 above.

Theorem 7 Let $(U_2, \mathcal{F})$ be an MDP pair and suppose (A) and (R) hold. Let $F^\theta(\omega,x)$ be smoothly
parametrized by $\theta$, with $F^\theta \in \mathcal{F}$ for all $\theta$. Then $\frac{\partial}{\partial\theta} V(F^\theta,u) \geq 0$ for all $u \in U_2$, if and only if
F'+'^  yMio-u,  F'. 

Proof. As in Theorem 4, we prove a stronger result, that (MIO) is necessary for $V(G,u,a^F) \geq
V(F,u,a^F)$, where $a^F$ is the optimal policy for information structure $F$. Suppose $G \not\succeq_{MIO\text{-}U_1} F$.


*See Jewitt (1986), Gollier and Kimball (1995), or Athey (1998a) for alternative proofs.

Using the representation (MIO''), this implies that there is some $\alpha$, and $r \in U_1$ such that

$$\int_\Omega r(\omega)\,dG(\omega, T_G^{-1}(\alpha)) > \int_\Omega r(\omega)\,dF(\omega, T_F^{-1}(\alpha)). \qquad (21)$$

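Let $A = \{0,1\}$. Then define

$$K_\alpha = E[r(\omega) \mid T_F(x) = \alpha] \,/\, E[\hat r(\omega) \mid T_F(x) = \alpha].$$

The denominator is non-negative by (R-$\mathcal{F}$). Then define

$$u(\omega,a) = a\,(r(\omega) - K_\alpha \hat r(\omega)).$$

By (R-U), $u \in U_2$. By (MDP-$\mathcal{F}$), $E[r(\omega) - K_\alpha \hat r(\omega) \mid T_F(x) = \alpha']$ is single crossing in $\alpha'$, so that an
optimal policy for this DM is to set $a^F(\alpha') = 1$ if $\alpha' > \alpha$ and zero otherwise. Then $V(G,u,a^F) \geq
V(F,u,a^F)$ if and only if

$$\int_{[0,1]} \int_\Omega u(\omega, a^F(\alpha'))\,dG(\omega, T_G^{-1}(\alpha')) \geq \int_{[0,1]} \int_\Omega u(\omega, a^F(\alpha'))\,dF(\omega, T_F^{-1}(\alpha')), \qquad (22)$$

or equivalently,

$$\int_\Omega [r(\omega) - K_\alpha \hat r(\omega)]\,dG(\omega, T_G^{-1}(\alpha)) \leq \int_\Omega [r(\omega) - K_\alpha \hat r(\omega)]\,dF(\omega, T_F^{-1}(\alpha)). \qquad (23)$$

But we have defined $T_F$ and $T_G$ so that $\int_\Omega \hat r(\omega)\,dG(\omega, T_G^{-1}(\alpha)) = \int_\Omega \hat r(\omega)\,dF(\omega, T_F^{-1}(\alpha))$ for all
$\alpha$. So (23) contradicts (21). As in Theorem 4, necessity for the case of a smoothly parameterized
distribution follows from an application of the Envelope Theorem. $\square$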

If $(U_2, \mathcal{F})$ is an MDP pair corresponding to $U_1$, and $F, G \in \mathcal{F}$, the posteriors induced by each
information structure are not required by (MDP-$\mathcal{F}$) to be totally ordered by stochastic dominance
unless $U_1$ contains the constant functions. If, however, they do happen to be ordered by stochastic
dominance, we can show that (MIO) and (MIO') are equivalent.

(MDP-SD)  For  all  F  E  T,     if  x",  x^  EsuppoH(F)  and  x"  >  x^,  then  F{u\x")  ysD-Ui 
F{uj\x^). 

Proposition 8 Let $(U_2, \mathcal{F})$ be an MDP pair and suppose (A), (R), and (MDP-SD) hold. Let
$F, G \in \mathcal{F}$. Then (MIO) is equivalent to (MIO').

Proof. Note that $\int \hat r\,dF(\omega \mid x)$ and $-\int \hat r\,dF(\omega \mid x)$ are both increasing in $x$, so it must be the case
that $\int \hat r\,dF(\omega \mid x) = \int \hat r\,dF(\omega)$ is constant in $x$. And likewise for $G$. Simplifying the expressions
from (18), $T_F(x) = F(x)$, $T_G(x) = G(x)$. It follows immediately that (MIO'') (which we know is
equivalent to (MIO')) is equivalent to (MIO). $\square$

The result obtains because when posteriors are ordered by stochastic dominance and (R-U)
holds, $E[\hat r \mid x]$ must be constant in $x$. It follows that $T_F(x)$ reduces immediately to $F(x)$, our earlier
index. In contrast, when the posteriors are ordered by $\succeq_{SC\text{-}U_1}$, $E[\hat r \mid x]$ can vary with $x$.

4.2.1     Examples 

We  now  interpret  the  monotone  information  orders  corresponding  to  Examples  3  and  4  from  above. 
In  the  process  we  relate  our  information  orders  to  Lehmann's  efficiency  criteria. 

Example 3. WSC($\omega_0$) (Incremental returns weak single crossing at $\omega_0$). Recall that if $r$ is
WSC($\omega_0$), then $r(\omega) \leq (\geq)\, 0$ as $\omega < (>)\, \omega_0$. So $\hat r = 1_{\{\omega = \omega_0\}}$, and $T_F(x) = F(x \mid \omega_0)$ using Bayes'
Rule. Then (MIO') reduces to

$$\Pr(G(y \mid \omega_0) > \alpha \mid \omega) \leq \Pr(F(x \mid \omega_0) > \alpha \mid \omega) \quad \text{for } \omega < \omega_0,$$

$$\Pr(G(y \mid \omega_0) > \alpha \mid \omega) \geq \Pr(F(x \mid \omega_0) > \alpha \mid \omega) \quad \text{for } \omega > \omega_0.$$

Alternatively, $F(F^{-1}(\alpha \mid \omega_0) \mid \omega) - G(G^{-1}(\alpha \mid \omega_0) \mid \omega)$ is WSC($\omega_0$). Recall that the DM is essentially
interested in knowing whether $\omega$ is above or below $\omega_0$, and (MDP-$\mathcal{F}$) ensures that high signals are
"good news" about $\omega$ being above $\omega_0$. The information condition implies that if the true state of the
world $\omega > \omega_0$, then the probability of receiving "good news" under $G$ (defined as $y > G^{-1}(\alpha \mid \omega_0)$) is
higher than under $F$. If we impose the additional assumption (MDP-SD), it follows that for all $x$,
$y$, $f(\omega_0 \mid x) = g(\omega_0 \mid y) = h(\omega_0)$ (the prior). Then, conditions (MIO) and (MIO') are equivalent, and
we can simply check that $g(\omega \mid G(y) > \alpha) - f(\omega \mid F(x) > \alpha)$ is WSC($\omega_0$) for all $\alpha$. That is, conditional
on high posteriors, low states are less likely and high states are more likely under $G$ as opposed
to $F$, while neither signal is informative about the state $\omega_0$.

The monotone information order for WSC($\omega_0$) can also be interpreted in terms of hypothesis
testing. Consider testing the null hypothesis $H_0 : \omega = \omega_0$ against the alternative $H_1 : \omega < \omega_0$.
However, the test is constrained to have a size of $1 - \alpha$: conditional on $H_0$, the probability of
rejecting the null is $1 - \alpha$. Thus, we reject when $F(x \mid \omega_0) > \alpha$, and likewise for $G$. Since MIO-WSC($\omega_0$)
is equivalent to requiring that $F(F^{-1}(\alpha \mid \omega_0) \mid \omega) - G(G^{-1}(\alpha \mid \omega_0) \mid \omega) \geq (\leq)\, 0$ if $\omega > (<)\, \omega_0$, we
have the following interpretation. The probability of rejecting the null when the alternative is true
($\omega < \omega_0$) is greater for $G$ than $F$; and the probability of rejecting the null when the alternative
is false ($\omega > \omega_0$) is smaller for $G$ than $F$. Thus, the hypothesis test using $G$ is uniformly more
powerful for a given size. This is exactly the interpretation given by Lehmann (1988) in his analysis,
which will be discussed in more detail below.

To see a very simple example where this order can be applied, return to the portfolio allocation
problem discussed in Section 3.3, where payoffs are given by $u(\omega,a) = v(a\omega + (1 - a)\omega_0)$. For this
problem, $F$ is more informative than $G$ if it provides a more powerful inference about whether or
not investment in the risky asset is worthwhile.

Example 4. Single Crossing (Incremental returns single crossing in $\omega$). The monotone information
order for the case of single crossing payoff functions and signal distributions with monotone
likelihood has been derived by Lehmann (1988). He showed that $G$ is more informative than $F$ for
this class of problems ($G \succeq_L F$) if for all $x \in X$, $G^{-1}(F(x \mid \omega) \mid \omega)$ is nondecreasing in $\omega$. Jewitt
(1997) has given further equivalent characterizations. We now show that Lehmann's order can be
obtained as a special case of our result.
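
As a rough numerical illustration of Lehmann's condition, the sketch below checks whether $G^{-1}(F(x \mid \omega) \mid \omega)$ is nondecreasing in $\omega$ on a small grid. The normal location model (signal equals state plus normal noise), the noise levels, and the function names are all assumptions made for this illustration; they do not come from the text.

```python
from statistics import NormalDist

def lehmann_transform(x, omega, sigma_f, sigma_g):
    """G^{-1}(F(x | omega) | omega) when both signals are state plus normal noise."""
    f = NormalDist(mu=omega, sigma=sigma_f)  # distribution of x given omega under F
    g = NormalDist(mu=omega, sigma=sigma_g)  # distribution of y given omega under G
    return g.inv_cdf(f.cdf(x))

def nondecreasing_in_state(x, states, sigma_f, sigma_g, tol=1e-9):
    """Check the condition for one signal value x over an increasing grid of states."""
    values = [lehmann_transform(x, w, sigma_f, sigma_g) for w in states]
    return all(b >= a - tol for a, b in zip(values, values[1:]))

states = [-1.0, -0.5, 0.0, 0.5, 1.0]   # illustrative grid of states
signals = [-1.0, 0.0, 1.0]             # illustrative signal realizations

# G has less noise than F: the condition should hold on the grid.
g_more_precise = all(nondecreasing_in_state(x, states, 1.0, 0.5) for x in signals)
# G has more noise than F: the condition should fail on the grid.
g_less_precise = all(nondecreasing_in_state(x, states, 0.5, 1.0) for x in signals)
print(g_more_precise, g_less_precise)
```

In this assumed model the transform is linear in $\omega$ with slope $1 - \sigma_G/\sigma_F$, so the check passes exactly when $G$ is the less noisy signal.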

Recall that we noted above that the set of all functions that cross zero once from below at
some $\omega_0$ was the union of all the WSC($\omega_0$) sets. By quantifying over all crossing points $\omega_0$, we can
recover the information preferences of DMs with single crossing payoff functions.

Theorem 9 The following are equivalent: (i) $G \succeq_L F$; (ii) $G \succeq_{MIO\text{-}WSC(\omega_0)} F$ for all $\omega_0 \in \Omega$;
(iii) $G \succeq_{MIO\text{-}SPM} F$ for all prior distributions $H(\omega)$.

Proof. First note from above that if $F(\omega \mid x)$ is increasing in $\succeq_{SC\text{-}WSC(\omega_0)}$ for all $\omega_0$ as $x$ increases,
then $F(\omega \mid x)$ is ordered by MLR. And similarly, if $F(\omega \mid x)$ is increasing by FOSD as $x$ increases for
all priors $H$, then $F$ is ordered by MLR. We first show that (i) and (ii) are equivalent. Suppose
$G \succeq_{MIO\text{-}WSC(\omega_0)} F$. This means that $F(F^{-1}(\alpha \mid \omega_0) \mid \omega) - G(G^{-1}(\alpha \mid \omega_0) \mid \omega)$ is WSC($\omega_0$). For
WSC($\omega_0$) to hold for every $\omega_0$, the expression must be increasing in $\omega$. Letting $x = F^{-1}(\alpha \mid \omega_0)$,
and taking $G^{-1}(\cdot \mid \omega)$ of each term, we have $G^{-1}(F(x \mid \omega) \mid \omega) - G^{-1}(F(x \mid \omega_0) \mid \omega_0)$ is WSC($\omega_0$). This
will hold for every $\omega_0$ if and only if $G^{-1}(F(x \mid \omega) \mid \omega)$ is increasing in $\omega$, which is exactly Lehmann's
condition. Now consider the equivalence of (ii) and (iii). Suppose $G \succeq_{MIO\text{-}SPM} F$. Then using
the definition of FOSD, we have $F(\omega \mid F(x) < \alpha) \leq G(\omega \mid G(y) < \alpha)$. Applying Bayes' Rule,
this is equivalent to $F(F^{-1}(\alpha) \mid \tilde\omega < \omega) \leq G(G^{-1}(\alpha) \mid \tilde\omega < \omega)$. Letting $x = F^{-1}(\alpha)$ and taking

Lehmann (1988) also considered necessity, although his theorem formally answers a slightly different question.
He finds that the efficiency order is necessary and sufficient for one signal to return higher payoffs than another in
every possible state (instead of on average, given the prior). The discussion following Example 3 above provides an
outline of Lehmann's approach.

$G^{-1}(\cdot \mid \tilde\omega < \omega)$ of both sides, we see this is equivalent to $G^{-1}(F(x \mid \tilde\omega < \omega) \mid \tilde\omega < \omega) \leq G^{-1}(F(x))$.
This is implied by $G \succeq_L F$, and quantifying over all two-point priors, it implies $\succeq_L$. $\square$

To see how Lehmann's order can depart from the supermodular monotone information order
in practice, return to Figure 4. In that example, recall that $z \succeq_{MIO\text{-}SPM} y \succeq_{MIO\text{-}SPM} x$. Using
Lehmann's order, $z \succeq_L y$, but $x$ does not satisfy MLR, so Lehmann's order does not apply to it.
Of course, none of the three are ordered by sufficiency.

The analysis can be extended to other payoff classes. For instance, it follows from results in
Athey (1998b) that when $U_2$ is the set of log-supermodular functions the appropriate information
order is again $\succeq_L$. Or in another example, consider the set of payoffs with incremental returns that
are positive only in some intermediate range between $\omega_0$ and $\omega_0'$, where $\omega_0 < \omega_0'$. This case works out
similarly to single crossing at a point.

5     Ordering  Payoff  Functions 

We now provide conditions under which one decision-maker has a higher incremental return to
improving her information than another decision-maker. Persico (1997) has investigated this question
for the case of single crossing payoff functions and has shown that if $\frac{\partial}{\partial\alpha} u(\omega, a^u(\alpha)) - \frac{\partial}{\partial\alpha} v(\omega, a^v(\alpha))$
is single-crossing in $\omega$, the first decision-maker ($u$) benefits more from a small increase in
information than the second ($v$), according to Lehmann's information order for single crossing payoff
functions. We generalize this result. To do this, we first limit our attention to marginal changes
in the information structure. If two signal distributions are linked by a smoothly parameterized
family of distributions which is information-ranked all along the way from $F$ to $G$, we can then
compare the incremental return to increasing the signal strength from $F$ to $G$.

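We need a small amount of notation: let $\Theta$ be a convex subset of $\mathbb{R}$, and suppose $\{F^\theta(\omega, x) : \theta \in
\Theta\}$ is a family of information structures smoothly parametrized by $\theta$. Let $a^{\theta,u}(\alpha)$ be a nondecreasing
selection from $\arg\max_a \int_\Omega u(\omega,a)\,dF^\theta(\omega \mid \alpha)$, and let $u^\theta(\omega,\alpha) = u(\omega, a^{\theta,u}(\alpha))$.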

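Theorem 10 Suppose $(U_2, \mathcal{F})$ is an MDP pair and that the family $\{F^\theta(\omega,x) : \theta \in \Theta\}$ is smoothly
parametrized on $\Theta$, with $F^\theta \in \mathcal{F}$ for all $\theta$. Let $u, v$ be bounded measurable payoff functions. If, for
some $\theta$, $u^\theta(\omega,\alpha) - v^\theta(\omega,\alpha) \in U_2$, then $\frac{\partial}{\partial\theta} V(F^\theta,u) \geq \frac{\partial}{\partial\theta} V(F^\theta,v)$ at $\theta$.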


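Proof. By the envelope theorem, we have

$$\frac{\partial}{\partial\theta} V(F^\theta,u) = \int_\Omega \int_{[0,1]} u(\omega, a^{\theta,u}(\alpha))\, d\!\left[\frac{\partial}{\partial\theta} F^\theta(\omega,\alpha)\right]. \qquad (24)$$

So letting $w(\omega,\alpha) = u(\omega, a^{\theta,u}(\alpha)) - v(\omega, a^{\theta,v}(\alpha))$ and $m(\omega,\alpha) = \frac{\partial}{\partial\theta} F^\theta(\omega,\alpha)$, we have

$$\frac{\partial}{\partial\theta} V(u; F^\theta) - \frac{\partial}{\partial\theta} V(v; F^\theta) = \int_\Omega \int_{[0,1]} w(\omega,\alpha)\, dm(\omega,\alpha). \qquad (25)$$

The assumption that $F^\theta$ is increasing in $\succeq_{MIO\text{-}U_1}$ as $\theta$ increases implies that for any $r(\omega) \in U_1$
and $\alpha \in (0,1)$,

$$\int_\Omega r(\omega)\, dm(\omega,\alpha) \geq 0. \qquad (26)$$

Lemma 5 then implies that (25) evaluated at $\theta$ is nonnegative. $\square$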

Theorem 10 says that if $u^\theta - v^\theta$ is in $U_2$, i.e. if $u^\theta$ is "more $U_2$" than $v^\theta$, then the decision-maker
with payoff function $u$ has a higher marginal value for information than the DM with payoff
function $v$. The theorem does not require that $u, v \in U_2$, or that the marginal value of information
is nonnegative for each agent, although it is somewhat hard to imagine applying the Theorem when
this is not the case (since policies might not be monotone).

If $u$ is more sensitive to information than $v$ in response to every signal in the family, we can
compare changes in information that are not marginal. From this, we can derive comparative statics
on the amount of information acquired.

Theorem 11 Suppose the conditions of Theorem 10 are satisfied, and that $u^\theta(\omega,\alpha) - v^\theta(\omega,\alpha) \in
U_2$ for every $\theta \in \Theta$, where $\Theta$ is a closed interval. Let $C : \Theta \to \mathbb{R}$ be the cost of information; and
let $\theta^*(u) = \arg\max_{\theta \in \Theta} V(\theta;u) - C(\theta)$. Then $\theta^*(u) \geq \theta^*(v)$ (in the strong set order).

Proof. Let $V(\theta; \gamma^H) = V(u; F^\theta)$, and $V(\theta; \gamma^L) = V(v; F^\theta)$. Applying Theorem 10, $V(\theta;\gamma)$ is
supermodular in $(\theta,\gamma)$, so $V(\theta,\gamma) - C(\theta)$ is supermodular in $(\theta,\gamma)$. By Topkis' (1978) Monotonicity
Theorem, $\theta^*(\gamma)$ is nondecreasing in the strong set order. $\square$
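
The comparative static in Theorem 11 can be illustrated with a toy computation. The sketch below assumes a hypothetical value of information $V(\theta;\gamma) = \gamma\sqrt{\theta}$, which has increasing differences in $(\theta,\gamma)$, and a linear cost $C(\theta) = \theta$; neither functional form comes from the text. It then compares the grid maximizers for a low and a high $\gamma$.

```python
import math

def optimal_quality(gamma, grid, unit_cost=1.0):
    """Grid argmax of gamma*sqrt(theta) - unit_cost*theta (hypothetical objective)."""
    return max(grid, key=lambda theta: gamma * math.sqrt(theta) - unit_cost * theta)

# Candidate signal qualities theta in [0, 4] (illustrative grid).
grid = [i / 100 for i in range(0, 401)]

theta_low = optimal_quality(1.0, grid)   # stands in for the payoff function v
theta_high = optimal_quality(2.0, grid)  # stands in for the payoff function u
print(theta_low, theta_high)
```

With these assumed forms the unconstrained optimum is $(\gamma/2)^2$, so the decision-maker with the higher marginal value of information chooses the higher signal quality, as the monotonicity theorem predicts.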

A few words are in order about the results of this section. A main drawback to Theorems 10
and 11 is that the conditions on $u$ and $v$ are not primitive, since they depend on the properties of
the objective function evaluated at its optimum. In general, verifying the necessary condition may
require knowing something about the shape of the optimizer, or about the curvature of the payoff
functions. To take an apparently simple example, suppose $u(\omega,a) = v(\omega,a) + g(a)$, where $v$ is
supermodular. One can verify that sufficient conditions for $u$ to acquire more information than $v$,
where information is ranked by the supermodular information order, are that $a^\theta(\alpha;u) \geq a^\theta(\alpha;v)$,
that $a^\theta_\alpha(\alpha;u) \geq a^\theta_\alpha(\alpha;v)$, and $v_{\omega a a} \geq 0$ (i.e. marginal returns are supermodular). The first
condition will hold if $g(a)$ is increasing, and the last is (reasonably) transparent, but even if $g$ is
linear, the second condition requires significant assumptions on the curvature of $u$. What conclusion
should we draw from this? While we think that these results hold promise for applied modeling, it
may be necessary to place a fair amount of structure on the model to make them operational.

On the other hand, the existing literature (prior to Persico (1997)) provides minimal guidance in
this area. For example, an increase in the Blackwell information order increases the ex ante expected
value of any function which is convex in the posterior beliefs. Thus, agent $u$ will buy more Blackwell-ordered
information than $v$ if and only if $u^*(P) - v^*(P) = \int_\Omega [u(\omega, a^{*,u}(P)) - v(\omega, a^{*,v}(P))]\,dP(\omega)$
is convex in the posterior $P(\omega)$. While convexity of $u^*(P)$ and $v^*(P)$ follows as a simple consequence
of optimality, convexity of the difference is not at all transparent. Checking the condition would
almost certainly involve non-primitive assumptions similar to the ones described in this section.
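
The two claims in the preceding paragraph can be checked numerically in a small finite example. The sketch below uses made-up payoff matrices (rows are actions, columns are states) and hypothetical helper names. It first tests midpoint convexity of $u^*(P)$ on random posteriors, then shows a two-state case in which $u^*(P) - v^*(P)$ fails to be convex.

```python
import random

def value(posterior, payoffs):
    """u*(P): the best expected payoff over actions under posterior P."""
    return max(sum(p * u for p, u in zip(posterior, row)) for row in payoffs)

def random_posterior(rng, n):
    """Draw a random probability vector over n states."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
payoffs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # made-up data

# Midpoint convexity of u*(P): a maximum of linear functions of P is convex.
midpoint_convex = True
for _ in range(1000):
    p, q = random_posterior(rng, 3), random_posterior(rng, 3)
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    if value(mid, payoffs) > (value(p, payoffs) + value(q, payoffs)) / 2 + 1e-12:
        midpoint_convex = False

# Two states, posterior (1 - p, p). Action A pays -1 or 1; action B pays 0.
u_payoffs = [[-1.0, 1.0], [0.0, 0.0]]
v_payoffs = [[-2.0, 2.0], [0.0, 0.0]]   # v doubles the stakes of u

def difference(p):
    post = [1 - p, p]
    return value(post, u_payoffs) - value(post, v_payoffs)

# Convexity would require difference(0.5) <= average of the endpoint values.
difference_convex = difference(0.5) <= (difference(0.0) + difference(1.0)) / 2
print(midpoint_convex, difference_convex)
```

Here $u^* - v^* = -\max(2p - 1, 0)$ is concave, so each value function is convex while their difference is not.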

6     Applications 

We now present several applications of the results developed above. The first two applications are
standard  decision  problems;  the  third  examines  adverse  selection  in  a  labor  market  equilibrium, 
while  the  fourth  studies  a  coordination  game.  The  examples  demonstrate  both  strengths  and 
weaknesses  of  the  approach.  In  particular,  while  comparisons  of  information  structures  may  be 
obtained  immediately  in  a  broad  range  of  cases,  comparative  statics  on  information  acquisition 
often  require  additional  assumptions. 

6.1     Application:  Cost  Uncertainty  for  Producers 

A growing literature in Industrial Organization considers the value of information to firms in
oligopoly models (see for instance Mirman, Samuelson, and Schlee (1994) and references therein).
The methods outlined above allow for some new results in this general class of problems.

Consider the problem faced by a producer who must choose quantity, $q$, to maximize some
objective. We will compare the importance of information to firms under different market structures.
However, we will restrict attention to covert information gathering, whereby we hold the strategies
of other agents fixed when we analyze the effects of gathering information.

Write the firm's gross surplus as a general function, $R(q;\beta)$, where $\beta$ parameterizes the market
structure. In particular, we will consider $\beta \in \{M, D, S\}$, representing monopoly, duopoly, and the
social planner's total surplus. The firm faces uncertainty about its cost: the total cost of producing


To see this, let $u(\omega,a,\tau) = v(\omega,a)$ when $\tau = 0$, and $v(\omega,a) + g(a)$ when $\tau = 1$. Then $u$ is supermodular in $(a,\tau)$
for every $\omega$, so $a^*(\alpha,\tau)$ will be increasing in $(\alpha,\tau)$. That is, $a^*(\alpha;u) \geq a^*(\alpha;v)$ for every $\alpha$.

$q$ is given by $C(q;\omega)$. Thus, the firm's profits are given by $\pi(q,\omega;\beta) = R(q;\beta) - C(q;\omega)$. We
consider a smoothly parameterized family of information structures, $\{F^\theta(\omega,\alpha) : \theta \in \Theta\}$, where
$F^\theta(\alpha) = \alpha$ for each $\theta$ and all $\alpha \in [0,1]$. Expected profits given an $\alpha$-percentile posterior are
denoted $\pi(q,\alpha;\beta) = R(q;\beta) - E[C(q,\omega) \mid \alpha]$. Assume that each of these functions is differentiable in
$q$. Under appropriate assumptions, the optimal choice of quantity is then determined by $MR(q;\beta) =
\frac{\partial}{\partial q} R(q;\beta) = \frac{\partial}{\partial q} E[C(q,\omega) \mid \alpha] = E[MC(q,\omega) \mid \alpha]$.

Suppose that $\theta$ indexes $F^\theta(\omega,\alpha)$ according to (MIO-SPM), and that $\alpha$ indexes $F^\theta(\omega \mid \alpha)$ according
to FOSD for all $\theta$. We first apply our analysis from above to characterize increasing information
for this problem. If we assume that $MC(q;\omega)$ is decreasing in $\omega$, then $\pi(q,\omega;\beta)$ is supermodular in
$(q,\omega)$. Since (MDP-$\mathcal{F}$) is satisfied, the optimal choice, $q^\beta(\alpha;\theta)$, will be nondecreasing in $\alpha$. Better
information (according to (MIO-SPM)) leads to higher ex ante expected payoffs.

We  now  turn  to  consider  how  the  marginal  value  of  information  changes  under  different  market 
conditions.  We  begin  with  the  following  result. 

Proposition 12 Consider the model described above, where for each $\theta$, $\alpha^\theta$ indexes $F^\theta(\omega \mid \alpha^\theta)$ according
to FOSD. Assume that the smoothly parameterized family $\{F^\theta(\omega,\alpha)\}$ is ordered by MIO-SPM.
In addition, assume that $MC(q;\omega)$ is decreasing in $\omega$, increasing in $q$, and submodular in
$(\omega,q)$. Finally, assume that $q^H(\alpha) \geq q^L(\alpha)$ and $q^{H\prime}(\alpha) \geq q^{L\prime}(\alpha)$. Then the marginal value of $\theta$ is
higher for firm $\beta^H$ than firm $\beta^L$.

Proof. By Theorem 10, it suffices to show that $\pi_q(q^H(\alpha),\omega;\beta^H)\,q^{H\prime}(\alpha) - \pi_q(q^L(\alpha),\omega;\beta^L)\,q^{L\prime}(\alpha)$ is
nondecreasing in $\omega$. If $q^{H\prime}(\alpha) \geq q^{L\prime}(\alpha)$, then our result obtains if $\pi_q(q^H(\alpha),\omega;\beta^H) - \pi_q(q^L(\alpha),\omega;\beta^L)$
is nondecreasing in $\omega$ (since each term is separately nondecreasing). Recall that $\pi_q(q^\beta(\alpha),\omega;\beta) =
MR(q^\beta(\alpha);\beta) - MC(q^\beta(\alpha),\omega)$ for each firm. Since the first term does not vary with $\omega$, we can
consider whether $MC(q^L(\alpha),\omega) - MC(q^H(\alpha),\omega)$ is nondecreasing in $\omega$, which follows since $MC(q,\omega)$
is submodular. $\square$

To interpret the conditions on the cost function, note that they are satisfied if $C(q,\omega) = q^2/\omega$.
To interpret the conditions on the quantities chosen, which of course are not primitive conditions,
it is useful to consider some examples. First consider comparing a monopolist's problem with that
of the social planner. Let $R(q;\beta^M) = P(q) \cdot q$ and $R(q;\beta^S) = \int_0^q P(t)\,dt$.
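
The claim about the cost function $C(q,\omega) = q^2/\omega$ can be verified on a grid. The sketch below, which assumes $\omega > 0$ and uses illustrative grid values, checks that the implied marginal cost $MC(q,\omega) = 2q/\omega$ is decreasing in $\omega$, increasing in $q$, and submodular in $(\omega,q)$.

```python
def marginal_cost(q, omega):
    """MC(q, omega) = dC/dq for C(q, omega) = q**2 / omega, with omega > 0."""
    return 2.0 * q / omega

quantities = [0.5, 1.0, 1.5, 2.0]   # illustrative increasing grid
states = [0.5, 1.0, 1.5, 2.0]       # illustrative increasing grid, all positive

decreasing_in_state = all(
    marginal_cost(q, w_hi) < marginal_cost(q, w_lo)
    for q in quantities
    for w_lo, w_hi in zip(states, states[1:]))

increasing_in_quantity = all(
    marginal_cost(q_hi, w) > marginal_cost(q_lo, w)
    for w in states
    for q_lo, q_hi in zip(quantities, quantities[1:]))

# Submodularity: the increment from raising q is smaller at the higher state.
submodular = all(
    marginal_cost(q_hi, w_hi) - marginal_cost(q_lo, w_hi)
    <= marginal_cost(q_hi, w_lo) - marginal_cost(q_lo, w_lo)
    for q_lo, q_hi in zip(quantities, quantities[1:])
    for w_lo, w_hi in zip(states, states[1:]))

print(decreasing_in_state, increasing_in_quantity, submodular)
```

Analytically, the cross-partial of $MC$ is $-2/\omega^2 < 0$, which is what the grid check of submodularity reflects.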

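As usual, it follows directly that $q^S(\alpha) \geq q^M(\alpha)$, that is, the monopolist underprovides quantity.
Now consider the terms $q^{S\prime}(\alpha)$ and $q^{M\prime}(\alpha)$. The implicit function theorem yields

$$q^{\beta\prime}(\alpha) = \frac{\frac{\partial}{\partial\alpha} E[MC(q^\beta(\alpha),\omega) \mid \alpha]}{\frac{\partial}{\partial q} MR(q^\beta(\alpha);\beta) - \frac{\partial}{\partial q} E[MC(q^\beta(\alpha),\omega) \mid \alpha]}.$$

Thus, $q^{S\prime}(\alpha) \geq q^{M\prime}(\alpha)$ if $MC(q,\omega)$ is submodular in $(q,\omega)$, if $MC(q,\omega)$ is convex in $q$, and if
$P'(q^S(\alpha)) \geq 2P'(q^M(\alpha)) + q^M(\alpha)P''(q^M(\alpha))$. If $P$ is linear or convex, a sufficient condition for the
latter inequality is the commonly assumed requirement that the marginal revenue curve is steeper
than the demand curve.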
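
A closed-form example may help fix ideas for the planner and monopolist comparison. The sketch below assumes linear inverse demand $P(q) = A - bq$ and the cost function $C(q,\omega) = q^2/\omega$, so that the first-order condition depends on the posterior only through $m = E[1/\omega \mid \alpha]$. The demand parameters and the sequence of $m$ values are hypothetical, chosen only so that $m$ falls as $\alpha$ rises.

```python
def quantity(m, mr_slope, intercept=10.0):
    """Solve intercept - mr_slope*q = 2*m*q, where m stands for E[1/omega | alpha]."""
    return intercept / (mr_slope + 2.0 * m)

b = 1.0                              # assumed slope of inverse demand P(q) = A - b*q
m_values = [2.0, 1.5, 1.0, 0.5]      # hypothetical posterior moments, falling in alpha

# The planner equates price to expected marginal cost; its "MR" slope is b.
planner = [quantity(m, b) for m in m_values]
# The monopolist's marginal revenue A - 2*b*q has slope 2*b.
monopolist = [quantity(m, 2.0 * b) for m in m_values]

# Increments in quantity as alpha rises, a discrete stand-in for q'(alpha).
planner_steps = [hi - lo for lo, hi in zip(planner, planner[1:])]
monopolist_steps = [hi - lo for lo, hi in zip(monopolist, monopolist[1:])]

levels_ordered = all(s >= m for s, m in zip(planner, monopolist))
steps_ordered = all(s >= m for s, m in zip(planner_steps, monopolist_steps))
print(levels_ordered, steps_ordered)
```

Under these assumptions the planner produces more at every posterior and responds more strongly to good news about cost, which are the two conditions on quantities required by Proposition 12.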

We can also compare the value of information for a monopolist and a duopolist. We suppose that
both duopolists have the same initial signal structure as the monopolist, but only one duopolist
has the opportunity to (covertly) purchase additional information (it is also possible to study the
information acquisition game, but we will not consider that case here). With these symmetry
assumptions, the result $q^M(\alpha) \geq q^D(\alpha)$ obtains so long as marginal revenue is downward sloping.
Further, $q^{M\prime}(\alpha) \geq q^{D\prime}(\alpha)$ if marginal cost is submodular, and if

$$2P'(q^M(\alpha)) + q^M(\alpha)P''(q^M(\alpha)) \geq 2P'(2q^D(\alpha)) + q^D(\alpha)P''(2q^D(\alpha)).$$

This condition is satisfied for the constant elasticity demand curve.

Summarizing, we see that under several reasonable conditions, the social planner has a larger
incentive to acquire information than the monopolist, but the monopolist may have a larger incentive
to acquire information than the duopolist.

6.2     Application:  Screening  for  a  "Target"  Hire

An entrepreneur or manager needs to hire a worker or subcontractor to perform a very specific
task. She interviews an interested party, who appears to have roughly the right characteristics.
The manager now has to make a single take-it-or-leave-it monetary offer based on information
gleaned from the interview. What sort of signal from the interview will cause the manager to make
a large offer? And what would it mean for the screening process to be more effective?

To model such a situation, we assume that there is some optimal task outcome $\omega^*$. Hiring
the agent would result in outcome $\omega$, and a payoff of $r(\omega)$, which achieves a maximum at $\omega^*$, and
decreases as $\omega$ moves away in both directions. So $r(\omega)$ is concave (but not necessarily symmetric;
it's not just the distance from perfection that matters, but the direction). A wage offer $a$ will be
accepted with probability $p(a)$, which is increasing. The manager has a signal $x$ about $\omega$ (arising
from an information structure $F(\omega,x)$), and chooses $a$ to maximize $p(a)E[r(\omega) - a \mid x]$. We can write
the payoff function $u(\omega,a)$ as $u(\omega,a) = p(a)[r(\omega) - a]$. In order for signals to be ordered in such
a way that the manager will offer more money following a "good" signal, regardless of the exact
shape of $p$ and $r$, it must be the case that $x$ orders the posteriors by SOSD (see Example 2 in
Section 3.3 for interpretations). If the potential signals from the interview satisfy this condition,
$a^*(x)$ will be monotone regardless of the exact shape of $p$ and $r$.

Now for the question of what constitutes better information. Applying (MIO-CV), $G$ will be a
more informative interview than $F$ if $\forall \alpha \in [0,1]$,

$$G(\omega \mid G(x) > \alpha) \succeq_{SOSD} F(\omega \mid F(x) > \alpha).$$

The intuition is simple: after the interview the expected outcome if the agent is hired is always $\omega^*$,
but there is some residual uncertainty. The manager does not like this residual risk; in particular,
the less risk she believes there is, the more aggressively she will pursue the candidate. A more
informative screening process is one where high signals (which are more likely to lead to hires)
imply less residual risk.

6.3     Application:  Adverse  Selection  and  Labor  Markets 

Our next example suggests how our information orders can be applied to adverse selection
markets. Consider the following stylized situation. Workers live for two periods and spend the first
period training (in school). Each worker's productivity is unknown with prior $H(\omega)$, and schooling
generates a noisy signal $x$ about underlying ability. Suppose that the joint distribution of $(\omega,x)$ is
$F(\omega,x)$. This signal is observed by the school, but not by the general labor market. Instead, only
the top $1 - \alpha$ fraction of the class "graduates," while the rest do not; the labor market observes
only if a given worker graduated. Each firm has production function $J(\omega)$, increasing in $\omega$, and the
labor market is competitive so that workers receive a wage $E[J(\omega) \mid \mathcal{I}]$, where $\mathcal{I}$ is the information
available to the market about productivity.

The market wage for graduates will be $E[J(\omega) \mid F(x) > \alpha]$, and for non-graduates $E[J(\omega) \mid F(x) <
\alpha]$. Now suppose that the schooling process becomes more informative about the worker's ability,
in the sense that $F$ increases to $G$ in the supermodular monotone information order (MIO-SPM).
It follows immediately that the wage for graduates will increase since

$$E[J(\omega) \mid G(y) > \alpha] \geq E[J(\omega) \mid F(x) > \alpha],$$

while the wage decreases for those who do not graduate. The point is that the labor market
interprets a failure to graduate as bad news about ability, and as worse news when the schooling
process is more revealing. Note that the average wage (and average production) for the whole
economy is constant at $E[J(\omega)]$, but inequality increases with information.
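
The wage effects described above can be computed in closed form under a simple parametric model. The sketch below assumes $J(\omega) = \omega$, a standard normal prior, and a schooling signal equal to ability plus independent normal noise; these choices, and the noise levels used, are illustrative assumptions. It only computes the wages; it takes as given that lowering the noise raises informativeness in the relevant order.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def wages(alpha, noise_sd):
    """Graduate and non-graduate wages for J(omega) = omega, omega ~ N(0, 1),
    and signal x = omega + noise; the top (1 - alpha) share of signals graduates."""
    signal_sd = (1.0 + noise_sd ** 2) ** 0.5
    z = STD_NORMAL.inv_cdf(alpha)          # standardized graduation cutoff
    tail = STD_NORMAL.pdf(z) / signal_sd   # equals E[omega * 1{x above cutoff}]
    return tail / (1.0 - alpha), -tail / alpha

alpha = 0.7
grad_f, nongrad_f = wages(alpha, noise_sd=1.0)   # noisier schooling signal F
grad_g, nongrad_g = wages(alpha, noise_sd=0.5)   # more precise schooling signal G

# Population-average wage under each signal; it should equal E[omega] = 0.
mean_f = (1 - alpha) * grad_f + alpha * nongrad_f
mean_g = (1 - alpha) * grad_g + alpha * nongrad_g
print(grad_f, grad_g, nongrad_f, nongrad_g)
```

In this assumed model the graduate wage rises and the non-graduate wage falls when the signal becomes more precise, while the average wage stays at the prior mean.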

To extend this, suppose now that a fraction $1 - \beta$ of workers, $\beta < \alpha$, can be hired into jobs that
are "skill-sensitive," that is, have production function $S(\omega)$, with $S'(\omega) > J'(\omega)$ for all $\omega$. And
suppose that $E[S(\omega)] = E[J(\omega)]$ so that "on average" productivity is the same at the two jobs.
Assuming that each worker gets her expected marginal product as a wage, the equilibrium in this
two-job labor market has a fraction $\beta$ of workers, all of whom graduated, going to "skill-sensitive"
jobs at a wage $E[S(\omega) \mid F(x) > \alpha]$, the remaining graduates taking less skill-sensitive jobs at a lower
wage (they are rationed), and non-graduates receiving $E[J(\omega) \mid F(x) < \alpha]$ as before.

Consider an increase in the information generated by school screening in the sense of (MIO-SPM).
This will increase the wage for graduates and decrease the wage for non-graduates. But it
will also increase the total production of the economy by leading to better matching between
high-skill workers and skill-sensitive jobs. Moreover, as the fraction of skill-sensitive jobs, $\beta$, increases
toward $\alpha$, the social returns to better screening by the schools increase (assuming the planner cares
only about gross production and not inequality). Similarly, if $S(\omega,\tau)$ is supermodular in $\tau$, then
the social returns to better screening will be increasing in $\tau$.

6.4     Application:  Coordination  Under  Uncertainty 

This section considers a game where both players can choose to acquire information. The game
involves two players with symmetric payoffs $u(\omega, a_i, a_j)$ where $a_i$ is player $i$'s action. Player $i$ receives
signal $\alpha_i$, with joint distribution $F^{\theta_i}(\omega,\alpha_i)$, where $F^{\theta_i}(\alpha_i) = \alpha_i$ and $\theta_i$ reflects signal quality.
Assume the $\alpha_i$'s are independent conditional on $\omega$. We let $u(\omega,a_1,a_2) = a_1[\omega + \gamma a_2 + K] - \frac{1}{2}\tau a_1^2$,
and restrict $\tau > \gamma > 0$. In this supermodular game, players make unobserved choices of signal
quality $(\theta_1,\theta_2)$ and then choose their strategies $(a_1,a_2)$. We look for symmetric Nash equilibria.
Conditional on the information structure, Player 1 maximizes

$$\int_\Omega \left[ a_1\left[\omega + \gamma E[a_2 \mid \omega; \theta] + K\right] - \tfrac{1}{2}\tau a_1^2 \right] dF^{\theta_1}(\omega \mid \alpha_1)$$

to find an optimal action $a_1(\alpha_1) = \frac{1}{\tau}\left[E[\omega \mid \alpha_1;\theta] + \gamma E[a_2 \mid \alpha_1;\theta] + K\right]$. In the unique (essentially)
symmetric equilibrium, $a_i(\alpha_i) = \frac{1}{\tau - \gamma}\left[E[\omega \mid \alpha_i;\theta_i] + K\right]$.

The first question to ask is when the optimal policy will be monotone in $\alpha_i$. Clearly, requiring
that the posteriors be ordered by FOSD is sufficient; actually, all that is needed is that the posteriors
be ordered according to their first moments. Since any two distributions can be ranked according
to their means, assuming (MDP-$\mathcal{F}$) is simply notational. The linearity of payoffs arises since
when $j$ plays her equilibrium strategy, $E[a_j \mid \omega] = \frac{1}{r-\gamma}[\omega + K]$, and so $u_i(a_i, \omega) = a_i \frac{r}{r-\gamma}[\omega + K]$.
Thus marginal returns are of the form $u_a(\omega) = c + d\omega$, where $d > 0$. The set of marginal returns
satisfying these restrictions forms a convex cone, with corresponding (MDP) and (MIO) conditions.
The condition for monotonicity of the policy is that $E[\omega \mid \alpha_i]$ be increasing in $\alpha_i$. The condition
under which $G$ is more informative than $F$ for all decision-makers with linear marginal returns can
be derived immediately from Theorem 3: $\forall \alpha \in [0, 1]$,

$$E[\omega \mid F(x) \le \alpha] \ge E[\omega \mid G(y) \le \alpha]. \qquad \text{(MIO-LIN)}$$

This is the linear monotone information order.
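
The linear order can be checked numerically in a simple parametric case. The sketch below is illustrative only: it assumes a standard normal state and signals equal to the state plus independent normal noise (none of which is specified in the text), and the function names are hypothetical. Under those assumptions the lower-tail conditional mean has a closed form, so the (MIO-LIN) inequality can be evaluated on a grid of percentiles.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and densities


def lower_tail_mean(alpha, noise_sd):
    """E[omega | signal percentile <= alpha] for omega ~ N(0, 1) and
    signal = omega + N(0, noise_sd**2) noise (an illustrative assumption).

    With rho = corr(omega, signal), this equals -rho * phi(z) / alpha,
    where z is the alpha-quantile of the standard normal.
    """
    if alpha >= 1.0:
        return 0.0  # conditioning on every realization returns the prior mean
    rho = 1.0 / sqrt(1.0 + noise_sd ** 2)
    z = STD_NORMAL.inv_cdf(alpha)
    return -rho * STD_NORMAL.pdf(z) / alpha


def satisfies_mio_lin(noise_sd_f, noise_sd_g, grid_size=99):
    """Check E[omega | F(x) <= alpha] >= E[omega | G(y) <= alpha] on a grid."""
    alphas = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return all(
        lower_tail_mean(a, noise_sd_f) >= lower_tail_mean(a, noise_sd_g)
        for a in alphas
    )
```

In this Gaussian setting, a less noisy signal pulls the conditional mean on low percentiles further below the prior mean, so `satisfies_mio_lin(2.0, 0.5)` holds while the reverse ranking does not.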

We now consider the information acquisition choice. Assume that $\theta_i$ indexes $F^{\theta_i}$ according to
(MIO-LIN). Expected payoffs to player $i$ are $\Pi_i(\theta_i, \theta_j) = E[u(\omega, a_i(\alpha_i), a_j(\alpha_j)) \mid \theta_1, \theta_2]$, so the symmetric
equilibrium must also satisfy $\Pi_1(\theta_1, \theta_1) - C'(\theta_1) = 0$, the first-order condition for information
acquisition. We are also interested in how changes in the parameters alter the amount of information
gathered in equilibrium. Straightforward algebra yields


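[Displayed equation only partly legible in the source; the surviving terms are $(\omega + K)$ and $(E[\omega \mid \alpha_i; \theta_i] + K)$.]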


Comparative statics follow immediately from an application of Theorem 10. A change in $K$, the
known marginal benefit to acting, has no effect on the amount of information gathered. An increase
in the quadratic cost of action, $r$, decreases the amount of information gathered. And, as is intuitive,
an increase in $\gamma$, the returns to coordination, increases information gathering. Similarly, agents'
actions are increasing in $\gamma$ and $K$ and decreasing in $r$. Endogenizing the information structure in this
game reinforces known complementarities.
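
The statement about actions can be read off the equilibrium strategy directly. The following sketch takes the equilibrium expression from the text as given and holds the posterior mean fixed; the shorthand $\mu_i$ for $E[\omega \mid \alpha_i; \theta_i]$ is introduced here for illustration and is not the paper's notation.

```latex
\begin{align*}
a_i &= \frac{\mu_i + K}{r-\gamma}, \qquad \mu_i \equiv E[\omega \mid \alpha_i; \theta_i], \qquad r > \gamma > 0, \\
\frac{\partial a_i}{\partial K} &= \frac{1}{r-\gamma} > 0, \qquad
\frac{\partial a_i}{\partial \gamma} = \frac{\mu_i + K}{(r-\gamma)^2}, \qquad
\frac{\partial a_i}{\partial r} = -\frac{\mu_i + K}{(r-\gamma)^2}.
\end{align*}
```

The signs of the last two derivatives match the text whenever $\mu_i + K > 0$, that is, whenever the expected marginal benefit of acting is positive at the realized signal.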

7     Conclusion 

In  this  paper,  we  have  obtained  a  new  set  of  results  for  the  standard  Bayesian  decision  problem. 
We  have  provided  a  general  analysis  of  when  one  signal  is  more  valuable  than  another  signal  to  a 
given  class  of  decision-makers,  and  also  provided  conditions  under  which  decision-makers  within  a 
given  class  will  differ  systematically  in  their  marginal  value  for  information.  We  have  applied  this 
general  framework  to  several  payoff  classes  of  particular  economic  relevance,  and  described  their 
corresponding  monotone  information  orders.  As  we  have  attempted  to  point  out  in  the  previous 
section,  the  results  can  be  applied  to  a  variety  of  economic  settings,  allowing  characterizations  of 
what sort of information is valuable in a given environment, and which sorts of agents care most
about  acquiring  it. 

An extremely interesting but potentially difficult extension of this research is to analyze the
value  of  information  in  strategic  settings.  In  such  a  setting,  information  may  have  both  a  direct 
and  a  strategic  value.  In  standard  incentive  problems,  information  has  only  strategic  value,  while 
in  Persico's  (1997)  study  of  auctions,  information  is  acquired  secretly  so  there  is  only  a  direct  value. 
In  the  industrial  organization  literature  discussed  in  Section  6.1,  results  combining  both  direct  and 
strategic  benefits  are  obtained,  but  they  typically  rely  on  restrictive  functional  form  assumptions. 
The examples from this literature suggest that unambiguous results about the value of information
may  not  be  available  in  many  settings.  It  remains  to  be  seen  whether  general  characterization 
results  can  be  obtained  in  game-theoretic  models. 

8     Appendix 

Proof of Lemma 5. Assume first that $A$ is finite, $A = \{a_1, \ldots, a_n\}$ with $a_{i+1} > a_i$. Then for any
monotone nondecreasing $a(\alpha)$, there exist $\alpha_0 = 0 \le \alpha_1 \le \cdots \le \alpha_{n-1} \le \alpha_n = 1$ such that $a(\alpha) = a_i$
on $[\alpha_{i-1}, \alpha_i]$. Then

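$$
\begin{aligned}
\int_{\Omega}\int_{[0,1]} u(\omega, a(\alpha))\,dm(\omega, \alpha)
 &= \int_{\Omega} \sum_{i=1}^{n} u(\omega, a_i)\left[m(\alpha_i \mid \omega) - m(\alpha_{i-1} \mid \omega)\right] dm(\omega) \\
 &= \int_{\Omega} \left\{ u(\omega, a_n)\,m(1 \mid \omega) - u(\omega, a_1)\,m(0 \mid \omega) - \sum_{i=1}^{n-1} \left[u(\omega, a_{i+1}) - u(\omega, a_i)\right] m(\alpha_i \mid \omega) \right\} dm(\omega) \\
 &= -\int_{\Omega} \left\{ \sum_{i=1}^{n-1} \left[u(\omega, a_{i+1}) - u(\omega, a_i)\right] m(\alpha_i \mid \omega) \right\} dm(\omega) \\
 &= -\sum_{i=1}^{n-1} \int_{\Omega} r_i(\omega)\,d_{\omega} m(\omega, \alpha_i)
\end{aligned}
$$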


for some $r_1(\omega), \ldots, r_{n-1}(\omega)$ with $r_i(\omega) \in U_1$. The third equality uses the fact that $m(1 \mid \omega)$, $m(0 \mid \omega)$
are zero a.e. with respect to $m(\omega)$. The last step follows from (MDP-U). Then clearly (16) implies
(15). Moreover, suppose (16) fails for some $\hat{\alpha} \in (0, 1)$, $\hat{r} \in U_1$. Then define $u(\omega, a) = a\,\hat{r}(\omega)$, and let
$a(\alpha) = a_1$ on $[0, \hat{\alpha})$ and $a(\alpha) = a_n$ on $[\hat{\alpha}, 1]$. By (MDP-U), $u \in U_2$, and $a$ is clearly nondecreasing.
Substituting into the derivation above,

$$\int_{\Omega}\int_{[0,1]} u(\omega, a(\alpha))\,dm(\omega, \alpha) = -(a_n - a_1) \int_{\Omega} \hat{r}(\omega)\,d_{\omega} m(\omega, \hat{\alpha}) < 0.$$

So  (15)  must  imply  (16). 
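
The second equality in the finite-action argument is a summation by parts. A minimal numerical sketch of that identity, at a fixed state, is given below; the function names and the list-based representation are hypothetical, with `u[i - 1]` standing for the payoff at the i-th action and `m[i]` for the measure evaluated at the i-th cutoff.

```python
def direct_sum(u, m):
    """sum_{i=1}^{n} u_i * (m_i - m_{i-1}), with u = [u_1..u_n], m = [m_0..m_n]."""
    n = len(u)
    return sum(u[i - 1] * (m[i] - m[i - 1]) for i in range(1, n + 1))


def summed_by_parts(u, m):
    """u_n * m_n - u_1 * m_0 - sum_{i=1}^{n-1} (u_{i+1} - u_i) * m_i."""
    n = len(u)
    boundary = u[n - 1] * m[n] - u[0] * m[0]
    # u[i] - u[i - 1] is the incremental return from action i to action i + 1
    interior = sum((u[i] - u[i - 1]) * m[i] for i in range(1, n))
    return boundary - interior
```

When the boundary values `m[0]` and `m[n]` are zero, only the incremental-return terms survive, which is the step the proof relies on.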

Now suppose $A$ is compact. We know from above that (16) is equivalent to (15) holding for all
step functions $a(\alpha)$. We show that (15) holds for all nondecreasing functions $a(\alpha)$ if and only if it
holds for all step functions. Let $a(\alpha)$ be some nondecreasing function, and let $a^1(\alpha), a^2(\alpha), \ldots$ be a
sequence of step functions converging to $a(\alpha)$. Since $u$ is continuous in $a$, $u(\omega, a^k(\alpha))$
converges to $u(\omega, a(\alpha))$. And since $u(\omega, a)$ is bounded, we can apply the Lebesgue Convergence
Theorem to show

$$\int_{\Omega \times [0,1]} u(\omega, a^k(\alpha))\,dm(\omega, \alpha) \;\to\; \int_{\Omega \times [0,1]} u(\omega, a(\alpha))\,dm(\omega, \alpha).$$

Then clearly, (15) will hold for all nondecreasing $a$ if and only if it holds for all step functions. □
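
The approximation step can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the payoff `payoff`, the policy `policy`, a single fixed state, and integration against the uniform distribution on [0, 1] rather than the signed measure used in the proof. The point is only that integrals of a bounded, continuous payoff along step-function policies approach the integral along the limiting nondecreasing policy.

```python
def payoff(omega, action):
    """Illustrative bounded payoff, continuous in the action."""
    return omega * action - 0.5 * action ** 2


def policy(alpha):
    """Illustrative nondecreasing policy on [0, 1]."""
    return alpha ** 2


def step_policy(alpha, k):
    """k-step approximation: hold the policy at the left end of each 1/k cell."""
    cell = min(int(alpha * k), k - 1)
    return policy(cell / k)


def expected_payoff_step(k, omega=1.0):
    """Integral over alpha in [0, 1] of payoff(omega, step_policy(alpha, k)).

    The integrand is constant on each cell, so averaging its value at the
    cell midpoints gives the integral exactly.
    """
    return sum(payoff(omega, step_policy((j + 0.5) / k, k)) for j in range(k)) / k


# Integral of alpha**2 - alpha**4 / 2 over [0, 1], the limit for omega = 1.
EXACT_LIMIT = 7.0 / 30.0
```

Refining the step policy (larger `k`) shrinks the gap between `expected_payoff_step(k)` and `EXACT_LIMIT`.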

Figure 1: Sufficiency with 3 states and 2-point signal distributions.

States: $\Omega = \{\omega_1, \omega_2, \omega_3\}$. Assume $\omega_1 < \omega_2 < \omega_3$. Prior: $(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$.
Two signals, $x$ and $y$, where $y$ is sufficient for $x$.
Each signal has two (equally likely) possible realizations:
$\mathcal{X} = \{x^L, x^H\}$, $\mathcal{Y} = \{y^L, y^H\}$, $\Pr(x = x^H) = \Pr(y = y^H) = \tfrac{1}{2}$.

Figure 2: Indexing Posteriors by Ex Ante Percentile.

$\alpha = F(x_\alpha)$, $\alpha = G(y_\alpha)$.

Figure 3: Illustration of Monotone Decision Problem (MDP) Conditions.

States: $\Omega = \{\omega_1, \omega_2, \omega_3\}$. $\mathcal{X} = \{x^L, x^H\}$. The posteriors generated by $x$ are illustrated in
the diagram.

Case 1: $U_2$ is the set of supermodular functions.
$F(\omega \mid x)$ satisfies (MDP) only if $F(\omega \mid x^H) \succ_{FOSD} F(\omega \mid x^L)$ for all $x^H > x^L$.
Given $F(\omega \mid x^L)$ as illustrated in the diagram, only posteriors within the solid lines satisfy the
(MDP) restriction.
Thus, $x$ is admissible.

Case 2: $U_2$ is the set of functions with single crossing incremental returns.
$F(\omega \mid x)$ satisfies (MDP) only if $F(\omega \mid x^H) \succ_{MLR} F(\omega \mid x^L)$ for all $x^H > x^L$.
Given $F(\omega \mid x^L)$ as illustrated in the diagram, only posteriors within the dotted lines
satisfy the (MDP) restriction.
Thus, $x$ is not admissible.

Figure 4: Supermodular Monotone Information Order with 3 states and 2-point
signal distributions.

States: $\Omega = \{\omega_1, \omega_2, \omega_3\}$. Assume $\omega_1 < \omega_2 < \omega_3$. Prior: $(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$.
Three signals, $x$, $y$, and $z$, where $z \succ_{MIO\text{-}SPM} y \succ_{MIO\text{-}SPM} x$.
No two signals are ordered by sufficiency.
Each signal has two (equally likely) possible realizations which satisfy MDP-SPM.
Robustness: If $F(\omega \mid x^H)$ is anywhere within the dotted region, $y \succ_{MIO\text{-}SPM} x$.

Figure 5: Illustration of the Supermodular Monotone Information Order.

States: $\Omega = \{\omega_1, \omega_2, \omega_3\}$. Assume $\omega_1 < \omega_2 < \omega_3$.
Two signals, $y$ and $z$, where $z \succ_{MIO\text{-}SPM} y$.
Moving from $y$ to $z$ moves probability weight onto
(Low signal, $\omega_1$) and (High signal, $\omega_3$), and away from
(High signal, $\omega_1$) and (Low signal, $\omega_3$).

Axis label: Change in probability of event from $y$ to $z$.